A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments

Renhua Peng, Zheng-Hua Tan, Xiaodong Li, Chengshi Zheng

Research output: Contribution to journal › Journal article › Research › peer-review

3 Citations (Scopus)

Abstract

Both reverberation and additive noise can degrade the quality of recorded speech and thus should be suppressed simultaneously. Previous studies have shown that the generalized singular value decomposition (GSVD) can suppress additive noise effectively, but it is not often applied to speech dereverberation since reverberation is considered to be convolutive as well as colored noise. Recently, we revealed that late reverberation is also an additive and relatively white interference component in the linear prediction (LP) residual domain. To suppress both late reverberation and additive noise, we have proposed an optimal filter for an LP residual estimator (LPRE) based on a constrained minimum mean square error (CMMSE) criterion by using GSVD in single-channel speech enhancement, where the algorithm is referred to as CMMSE-GSVD-LPRE. Experimental results have shown that the CMMSE-GSVD-LPRE performs better than spectral subtraction methods, but some residual noise and reverberation components are still audible and annoying. To solve this problem, this paper incorporates the masking properties of the human auditory system in the LP residual domain to further suppress these residual noise and reverberation components while reducing speech distortion at the same time. Various simulation experiments are conducted, and the results show an improved performance of the proposed algorithm. Experimental results with speech recorded in noisy and reverberant environments further confirm the effectiveness of the proposed algorithm in real-world environments.
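
The abstract works in the LP residual domain without showing how a residual is obtained. As a rough illustration only, the sketch below computes the LP residual of a single frame with the standard autocorrelation method. It is not the CMMSE-GSVD-LPRE algorithm or its perceptual extension; the function names `lp_coefficients` and `lp_residual`, the default order of 10, and the small diagonal loading term are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Estimate LP coefficients a[1..order] with the autocorrelation method.

    The predictor is x_hat[n] = sum_k a[k] * x[n - k].
    """
    x = np.asarray(frame, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    if r[0] == 0.0:
        # Silent frame: nothing to predict, return an all-zero predictor.
        return np.zeros(order)
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading (an assumption of this sketch) for conditioning.
    R += 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, order=10):
    """Return the LP residual e[n] = x[n] - sum_k a[k] * x[n - k]."""
    x = np.asarray(frame, dtype=float)
    a = lp_coefficients(x, order)
    # Inverse (whitening) filter A(z) = 1 - sum_k a[k] z^{-k}.
    inverse_filter = np.concatenate(([1.0], -a))
    # Truncate the convolution so the residual has the frame's length.
    return np.convolve(x, inverse_filter)[: len(x)]
```

In a typical frame-based setup, `lp_residual` would be applied to short windowed frames of the noisy, reverberant signal, and any enhancement filter would then operate on the resulting residual frames; the framing and the filter itself are outside the scope of this sketch.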

Original language: English
Journal: Speech Communication
Volume: 96
Pages (from-to): 129-141
Number of pages: 13
ISSN: 0167-6393
DOI: 10.1016/j.specom.2017.12.004
Publication status: Published - Feb 2018

Fingerprint

Linear Prediction
Reverberation
Generalized Singular Value Decomposition
Singular value decomposition
Estimator
Additive noise
Acoustic noise
Minimum Mean Square Error
Mean square error
Values
Speech enhancement
Optimal Filter
Colored Noise
Masking
Experimental Results
Subtraction
performance
Prediction

Keywords

  • Auditory masking
  • Generalized singular value decomposition
  • Linear prediction residual
  • Minimum mean square error
  • Speech dereverberation

Cite this

Peng, Renhua ; Tan, Zheng-Hua ; Li, Xiaodong ; Zheng, Chengshi. / A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments. In: Speech Communication. 2018 ; Vol. 96. pp. 129-141.
@article{8ee6d24012c24f23a249cc6b2f13b061,
title = "A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments",
keywords = "Auditory masking, Generalized singular value decomposition, Linear prediction residual, Minimum mean square error, Speech dereverberation",
author = "Renhua Peng and Zheng-Hua Tan and Xiaodong Li and Chengshi Zheng",
year = "2018",
month = "2",
doi = "10.1016/j.specom.2017.12.004",
language = "English",
volume = "96",
pages = "129--141",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",

}

A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments. / Peng, Renhua; Tan, Zheng-Hua; Li, Xiaodong; Zheng, Chengshi.

In: Speech Communication, Vol. 96, 02.2018, p. 129-141.

TY - JOUR

T1 - A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments

AU - Peng, Renhua

AU - Tan, Zheng-Hua

AU - Li, Xiaodong

AU - Zheng, Chengshi

PY - 2018/2

Y1 - 2018/2

KW - Auditory masking

KW - Generalized singular value decomposition

KW - Linear prediction residual

KW - Minimum mean square error

KW - Speech dereverberation

UR - http://www.scopus.com/inward/record.url?scp=85038031269&partnerID=8YFLogxK

U2 - 10.1016/j.specom.2017.12.004

DO - 10.1016/j.specom.2017.12.004

M3 - Journal article

VL - 96

SP - 129

EP - 141

JO - Speech Communication

JF - Speech Communication

SN - 0167-6393

ER -