A Framework for Speech Enhancement with Ad Hoc Microphone Arrays

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Speech enhancement is vital for improved listening practices, and ad hoc microphone arrays are promising assets for this purpose. Most well-established enhancement techniques for conventional arrays can be adapted to ad hoc scenarios. Despite recent efforts to introduce various ad hoc speech enhancement apparatus, a common framework for integrating conventional methods into this new scheme is still missing. This paper establishes such an abstraction based on inter- and intra-sub-array speech coherencies. Along with measures of signal quality at the input of the sub-arrays, a measure of coherency is proposed both for sub-array selection in local enhancement approaches and for selecting a proper global reference when more than one sub-array is used. The proposed methods within this framework are evaluated with regard to quantitative and qualitative measures, including array gains, the speech distortion ratio, the PESQ measure, and the STOI intelligibility measure. The major findings of this work are the observed changes in the superiority of different methods under certain conditions. When perceptual quality or intelligibility of the speech is the ultimate goal, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods. Also, for certain scenarios, local approaches may be preferred to global ones.
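For readers unfamiliar with the MVDR beamformer named in the abstract, the following is a minimal sketch of the standard textbook formulation for a single frequency bin, w = R⁻¹d / (dᴴR⁻¹d). It is not the paper's framework or code: the function name `mvdr_weights`, the 4-microphone setup, the random steering vector, and the synthetic noise covariance are all assumptions made for illustration only.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin.

    noise_cov: (M, M) Hermitian positive-definite noise covariance (assumed known).
    steering:  (M,) steering vector toward the desired speech source (assumed known).
    """
    r_inv_d = np.linalg.solve(noise_cov, steering)   # R^{-1} d without forming the inverse
    return r_inv_d / (steering.conj() @ r_inv_d)     # normalise so that w^H d = 1


# Hypothetical 4-microphone example (synthetic data, for illustration only).
rng = np.random.default_rng(0)
num_mics = 4

# Assumed unit-modulus steering vector (pure phase delays).
steering = np.exp(-1j * 2 * np.pi * rng.random(num_mics))

# Synthetic noise covariance estimated from 50 random snapshots, lightly regularised.
snapshots = rng.standard_normal((num_mics, 50)) + 1j * rng.standard_normal((num_mics, 50))
noise_cov = snapshots @ snapshots.conj().T / 50 + 1e-3 * np.eye(num_mics)

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: the response toward the source is 1.
response = w.conj() @ steering

# Residual noise power at the beamformer output versus at microphone 0.
noise_power_out = np.real(w.conj() @ noise_cov @ w)
noise_power_ref = np.real(noise_cov[0, 0])
```

Under these assumptions the beamformer passes the desired direction undistorted while its output noise power does not exceed that of the first microphone. A Wiener-based filter, by contrast, trades some speech distortion for additional noise reduction, which is the kind of trade-off the evaluation measures listed in the abstract are meant to capture.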

Details

Original language: English
Article number: 07423739
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 24
Issue number: 16
Pages (from-to): 1038-1051
Number of pages: 14
ISSN: 1558-7916
DOI
State: Published - 2 Mar 2016
Publication category: Research
Peer-reviewed: Yes

Research areas

• speech enhancement, microphone array, noise reduction, multichannel, pseudo-coherence vector, ad hoc array
ID: 228128977