Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

Adam Kuklasinski; Simon Doclo; Søren Holdt Jensen; Jesper Jensen

doi:10.1109/TASLP.2016.2573591

Department of Electronic Systems

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.

Original language	English
Journal	IEEE Transactions on Audio, Speech and Language Processing
Volume	24
Issue number	9
Pages (from-to)	1599-1612
Number of pages	14
ISSN	1558-7916
DOIs	https://doi.org/10.1109/TASLP.2016.2573591
Publication status	Published - 1 Sept 2016

Cite this

@article{69c00e9e36d94747b2568d9a46036403,

title = "Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise",

abstract = "In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cram{\'e}r-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.",

author = "Adam Kuklasinski and Simon Doclo and Jensen, {S{\o}ren Holdt} and Jesper Jensen",

year = "2016",

month = sep,

day = "1",

doi = "10.1109/TASLP.2016.2573591",

language = "English",

volume = "24",

pages = "1599--1612",

journal = "IEEE Transactions on Audio, Speech and Language Processing",

issn = "1558-7916",

publisher = "IEEE Signal Processing Society",

number = "9",

}

TY - JOUR

T1 - Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

AU - Kuklasinski, Adam

AU - Doclo, Simon

AU - Jensen, Søren Holdt

AU - Jensen, Jesper

PY - 2016/9/1

Y1 - 2016/9/1

N2 - In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.

AB - In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.

U2 - 10.1109/TASLP.2016.2573591

DO - 10.1109/TASLP.2016.2573591

M3 - Journal article

SN - 1558-7916

VL - 24

SP - 1599

EP - 1612

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

IS - 9

ER -
