Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.
Original language: English
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 24
Issue number: 9
Pages (from-to): 1599-1612
Number of pages: 14
ISSN: 1558-7916
DOIs: 10.1109/TASLP.2016.2573591
Publication status: Published - 1 Sep 2016

Fingerprint

Speech enhancement
Reverberation
Power spectral density
Acoustic noise
Maximum likelihood
augmentation
estimators
intelligibility
Speech intelligibility
Microphones
Error analysis
Statistics
Acoustic waves
methodology
filters
acoustics

Cite this

@article{69c00e9e36d94747b2568d9a46036403,
title = "Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise",
abstract = "In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cram{\'e}r-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.",
author = "Adam Kuklasinski and Simon Doclo and Jensen, {S{\o}ren Holdt} and Jesper Jensen",
year = "2016",
month = "9",
day = "1",
doi = "10.1109/TASLP.2016.2573591",
language = "English",
volume = "24",
pages = "1599--1612",
journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
issn = "2329-9290",
publisher = "IEEE Signal Processing Society",
number = "9",
}

Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise. / Kuklasinski, Adam; Doclo, Simon; Jensen, Søren Holdt; Jensen, Jesper.

In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 24, No. 9, 01.09.2016, p. 1599-1612.

TY - JOUR

T1 - Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

AU - Kuklasinski, Adam

AU - Doclo, Simon

AU - Jensen, Søren Holdt

AU - Jensen, Jesper

PY - 2016/9/1

Y1 - 2016/9/1

AB - In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which besides speech and reverberation consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.

U2 - 10.1109/TASLP.2016.2573591

DO - 10.1109/TASLP.2016.2573591

M3 - Journal article

VL - 24

SP - 1599

EP - 1612

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

SN - 2329-9290

IS - 9

ER -