A Framework for Speech Enhancement with Ad Hoc Microphone Arrays

Vincent Mohammad Tavakoli; Jesper Rindom Jensen; Mads Græsbøll Christensen; Jacob Benesty

doi:10.1109/TASLP.2016.2537202

Research output: Contribution to journal › Journal article › Research › peer-review

36 Citations (Scopus)

141 Downloads (Pure)

Abstract

Speech enhancement is vital for improved listening practices. Ad hoc microphone arrays are promising assets for this purpose. Most well-established enhancement techniques with conventional arrays can be adapted into ad hoc scenarios. Despite recent efforts to introduce various ad hoc speech enhancement apparatus, a common framework for integration of conventional methods into this new scheme is still missing. This paper establishes such an abstraction based on inter and intra sub-array speech coherencies. Along with measures for signal quality at the input of sub-arrays, a measure of coherency is proposed both for sub-array selection in local enhancement approaches, and also for selecting a proper global reference when more than one sub-array are used. Proposed methods within this framework are evaluated with regard to quantitative and qualitative measures, including array gains, the speech distortion ratio, the PESQ measure, and the STOI intelligibility measure. Major findings in this work are the observed changes in the superiority of different methods for certain conditions. When perceptual quality or intelligibility of the speech are the ultimate goals, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods. Also, for certain scenarios, local approaches may be preferred to global ones.

Original language	English
Article number	07423739
Journal	IEEE Transactions on Audio, Speech and Language Processing
Volume	24
Issue number	6
Pages (from-to)	1038-1051
Number of pages	14
ISSN	1558-7916
DOIs	https://doi.org/10.1109/TASLP.2016.2537202
Publication status	Published - 2 Mar 2016

Keywords

speech enhancement
microphone array
noise reduction
multichannel
pseudo-coherence vector
ad hoc array

Access to Document

10.1109/TASLP.2016.2537202

07423739: Submitted manuscript, 2.36 MB

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7423739

Projects

Localization and Tracking of Speech - a Joint Audio-Visual Approach
Jensen, J. R.
01/10/2013 → 30/09/2016
Project: Research

Spatio-Temporal Filtering Methods for Enhancement and Separation of Speech Signals
Christensen, M. G., Nørholm, S. M., Karimian-Azari, S. & Jensen, J. R.
01/08/2012 → 30/06/2015
Project: Research

Cite this

@article{d5f1d8e218f24b7eab491bb4b1ee467d,

title = "A Framework for Speech Enhancement with Ad Hoc Microphone Arrays",

abstract = "Speech enhancement is vital for improved listening practices. Ad hoc microphone arrays are promising assets for this purpose. Most well-established enhancement techniques with conventional arrays can be adapted into ad hoc scenarios. Despite recent efforts to introduce various ad hoc speech enhancement apparatus, a common framework for integration of conventional methods into this new scheme is still missing. This paper establishes such an abstraction based on inter and intra sub-array speech coherencies. Along with measures for signal quality at the input of sub-arrays, a measure of coherency is proposed both for sub-array selection in local enhancement approaches, and also for selecting a proper global reference when more than one sub-array are used. Proposed methods within this framework are evaluated with regard to quantitative and qualitative measures, including array gains, the speech distortion ratio, the PESQ measure, and the STOI intelligibility measure. Major findings in this work are the observed changes in the superiority of different methods for certain conditions. When perceptual quality or intelligibility of the speech are the ultimate goals, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods. Also, for certain scenarios, local approaches may be preferred to global ones.",

keywords = "speech enhancement, microphone array, noise reduction, multichannel, pseudo-coherence vector, ad hoc array",

author = "Tavakoli, {Vincent Mohammad} and Jensen, {Jesper Rindom} and Christensen, {Mads Gr{\ae}sb{\o}ll} and Benesty, Jacob",

year = "2016",

month = mar,

day = "2",

doi = "10.1109/TASLP.2016.2537202",

language = "English",

volume = "24",

pages = "1038--1051",

journal = "IEEE Transactions on Audio, Speech and Language Processing",

issn = "1558-7916",

publisher = "IEEE Signal Processing Society",

number = "6",

}

TY - JOUR

T1 - A Framework for Speech Enhancement with Ad Hoc Microphone Arrays

AU - Tavakoli, Vincent Mohammad

AU - Jensen, Jesper Rindom

AU - Christensen, Mads Græsbøll

AU - Benesty, Jacob

PY - 2016/3/2

Y1 - 2016/3/2

N2 - Speech enhancement is vital for improved listening practices. Ad hoc microphone arrays are promising assets for this purpose. Most well-established enhancement techniques with conventional arrays can be adapted into ad hoc scenarios. Despite recent efforts to introduce various ad hoc speech enhancement apparatus, a common framework for integration of conventional methods into this new scheme is still missing. This paper establishes such an abstraction based on inter and intra sub-array speech coherencies. Along with measures for signal quality at the input of sub-arrays, a measure of coherency is proposed both for sub-array selection in local enhancement approaches, and also for selecting a proper global reference when more than one sub-array are used. Proposed methods within this framework are evaluated with regard to quantitative and qualitative measures, including array gains, the speech distortion ratio, the PESQ measure, and the STOI intelligibility measure. Major findings in this work are the observed changes in the superiority of different methods for certain conditions. When perceptual quality or intelligibility of the speech are the ultimate goals, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods. Also, for certain scenarios, local approaches may be preferred to global ones.

KW - speech enhancement

KW - microphone array

KW - noise reduction

KW - multichannel

KW - pseudo-coherence vector

KW - ad hoc array

U2 - 10.1109/TASLP.2016.2537202

DO - 10.1109/TASLP.2016.2537202

M3 - Journal article

SN - 1558-7916

VL - 24

SP - 1038

EP - 1051

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

IS - 6

M1 - 07423739

ER -
