TY - JOUR
T1 - An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model
AU - Nielsen, Jesper Kjaer
AU - Christensen, Mads Graesboll
AU - Boldt, Jesper Bunsow
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2023
Y1 - 2023
AB - Many single- and multi-channel speech enhancement techniques, old and new, rely in one way or another on estimates of the noise power spectral density (PSD). For example, the classical Wiener filter requires that either the speech or noise PSD be estimated. Typically, the noise PSD is estimated, as it is often easier to model and estimate than the speech PSD. As a result, much attention has been paid to this important problem over the past couple of decades, with important scientific milestones being the minimum statistics (MS), the improved minima controlled recursive averaging (IMCRA), and the minimum mean squared error (MMSE) estimators. Despite leading to major progress, these estimators are rather ad hoc, making them difficult to tune and improve in a systematic manner. In this article, we analyze some of the common heuristics employed in such noise PSD estimators to put them on firmer mathematical ground. More specifically, we use the Gaussian stochastic volatility model and show that the MMSE noise PSD estimator can be interpreted as a special case thereof. Moreover, we analyze the related problem of speech presence probability (SPP) estimation and show that the SPP estimation performed in the MMSE noise PSD estimator can be interpreted as an SNR estimator in the context of the Gaussian stochastic volatility model.
KW - Noise PSD estimation
KW - speech enhancement
KW - speech presence probabilities
UR - http://www.scopus.com/inward/record.url?scp=85161612803&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2023.3282107
DO - 10.1109/TASLP.2023.3282107
M3 - Journal article
AN - SCOPUS:85161612803
SN - 2329-9290
VL - 31
SP - 2299
EP - 2313
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
ER -