An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model

Jesper Kjaer Nielsen; Mads Graesboll Christensen; Jesper Bunsow Boldt

doi:10.1109/TASLP.2023.3282107

Jesper Kjaer Nielsen, Mads Graesboll Christensen^*, Jesper Bunsow Boldt

^*Corresponding author

Publication: Contribution to journal › Journal article › Research › peer-review

Abstract

Many single- and multi-channel speech enhancement techniques, old and new, rely in one way or another on estimates of the noise power spectral density (PSD). For example, the classical Wiener filter requires that either the speech or noise PSD be estimated. Typically, the noise PSD is estimated, as it is often easier to model and estimate than the speech PSD. As a result, much attention has been paid to this problem over the past couple of decades, with important scientific milestones being the minimum statistics (MS), the improved minima controlled recursive averaging (IMCRA), and the minimum mean squared error (MMSE) estimators. Despite leading to major progress, these estimators are rather ad hoc, making them difficult to tune and improve in a systematic manner. In this article, we analyze some of the common heuristics employed in such noise PSD estimators to put them on firmer mathematical ground. More specifically, we use the Gaussian stochastic volatility model and show that the MMSE noise PSD estimator can be interpreted as a special case thereof. Moreover, we analyze the related problem of speech presence probability (SPP) estimation and show that the SPP estimation performed in the MMSE noise PSD estimator can be interpreted as an SNR estimator in the context of the Gaussian stochastic volatility model.

Original language	English
Journal	IEEE/ACM Transactions on Audio Speech and Language Processing
Volume	31
Pages (from-to)	2299-2313
Number of pages	15
ISSN	2329-9290
DOI	https://doi.org/10.1109/TASLP.2023.3282107
Status	Published - 2023

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Access to document

10.1109/TASLP.2023.3282107

Citation formats

@article{64a3bdb42fc14dc296bc8b383f21610c,

title = "An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model",

abstract = "Many single- and multi-channel speech enhancement techniques, old and new, rely in one way or another on estimates of the noise power spectral density (PSD). For example, the classical Wiener filter requires that either the speech or noise PSD be estimated. Typically, the noise PSD is estimated, as it is often easier to model and estimate than the speech. As a result, much attention has been paid to this important problem over the past couple of decades, with important scientific milestones being the minimum statistics (MS), the minima controlled recursive averaging (IMCRA), and the minimum mean squared (MMSE) estimators. Despite leading to major progress, these estimators are rather ad hoc, making them difficult to tune and improve in a systematic manner. In this article, we analyse some of the common heuristics employed in such noise PSD estimators to put them on firmer mathematical ground. More specifically, we use the Gaussian stochastic volatility model and show that the MMSE noise PSD estimator can be interpreted as a special case thereof. Moreover, we analyze the related problem of speech presence probability (SPP) estimation and show that the SPP estimation performed in the MMSE noise PSD estimator can be interpreted as an SNR estimator in the context of the Gaussian stochastic volatility model.",

keywords = "Noise PSD estimation, speech enhancement, speech presence probabilities",

author = "Nielsen, {Jesper Kjaer} and Christensen, {Mads Graesboll} and Boldt, {Jesper Bunsow}",

note = "Publisher Copyright: {\textcopyright} 2014 IEEE.",

year = "2023",

doi = "10.1109/TASLP.2023.3282107",

language = "English",

volume = "31",

pages = "2299--2313",

journal = "IEEE/ACM Transactions on Audio Speech and Language Processing",

issn = "2329-9290",

publisher = "IEEE Signal Processing Society",

}

An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model. / Nielsen, Jesper Kjaer; Christensen, Mads Graesboll; Boldt, Jesper Bunsow.
In: IEEE/ACM Transactions on Audio Speech and Language Processing, Vol. 31, 2023, p. 2299-2313.

TY - JOUR

T1 - An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model

AU - Nielsen, Jesper Kjaer

AU - Christensen, Mads Graesboll

AU - Boldt, Jesper Bunsow

PY - 2023

Y1 - 2023

N2 - Many single- and multi-channel speech enhancement techniques, old and new, rely in one way or another on estimates of the noise power spectral density (PSD). For example, the classical Wiener filter requires that either the speech or noise PSD be estimated. Typically, the noise PSD is estimated, as it is often easier to model and estimate than the speech. As a result, much attention has been paid to this important problem over the past couple of decades, with important scientific milestones being the minimum statistics (MS), the minima controlled recursive averaging (IMCRA), and the minimum mean squared (MMSE) estimators. Despite leading to major progress, these estimators are rather ad hoc, making them difficult to tune and improve in a systematic manner. In this article, we analyse some of the common heuristics employed in such noise PSD estimators to put them on firmer mathematical ground. More specifically, we use the Gaussian stochastic volatility model and show that the MMSE noise PSD estimator can be interpreted as a special case thereof. Moreover, we analyze the related problem of speech presence probability (SPP) estimation and show that the SPP estimation performed in the MMSE noise PSD estimator can be interpreted as an SNR estimator in the context of the Gaussian stochastic volatility model.

AB - Many single- and multi-channel speech enhancement techniques, old and new, rely in one way or another on estimates of the noise power spectral density (PSD). For example, the classical Wiener filter requires that either the speech or noise PSD be estimated. Typically, the noise PSD is estimated, as it is often easier to model and estimate than the speech. As a result, much attention has been paid to this important problem over the past couple of decades, with important scientific milestones being the minimum statistics (MS), the minima controlled recursive averaging (IMCRA), and the minimum mean squared (MMSE) estimators. Despite leading to major progress, these estimators are rather ad hoc, making them difficult to tune and improve in a systematic manner. In this article, we analyse some of the common heuristics employed in such noise PSD estimators to put them on firmer mathematical ground. More specifically, we use the Gaussian stochastic volatility model and show that the MMSE noise PSD estimator can be interpreted as a special case thereof. Moreover, we analyze the related problem of speech presence probability (SPP) estimation and show that the SPP estimation performed in the MMSE noise PSD estimator can be interpreted as an SNR estimator in the context of the Gaussian stochastic volatility model.

KW - Noise PSD estimation

KW - speech enhancement

KW - speech presence probabilities

UR - http://www.scopus.com/inward/record.url?scp=85161612803&partnerID=8YFLogxK

U2 - 10.1109/TASLP.2023.3282107

DO - 10.1109/TASLP.2023.3282107

M3 - Journal article

AN - SCOPUS:85161612803

SN - 2329-9290

VL - 31

SP - 2299

EP - 2313

JO - IEEE/ACM Transactions on Audio Speech and Language Processing

JF - IEEE/ACM Transactions on Audio Speech and Language Processing

ER -
