Abstract
The a posteriori speech presence probability (SPP) plays a critical role in noise power spectral density (PSD) estimation, which is essential for both speech enhancement and recognition systems. While existing SPP estimators perform well under stationary noise conditions, challenges remain in accurately estimating SPP in non-stationary environments and in reducing the computational complexity of deep learning-based methods. In this paper, we build upon a previously proposed hybrid global–local information-based SPP estimation framework and extend its analysis through a comprehensive experimental study. The framework incorporates joint global and local spectral representations and includes targeted refinements aimed at improving robustness in low signal-to-noise ratio (SNR) scenarios. Beyond standalone SPP estimation, the proposed approach is evaluated in downstream applications, including noise PSD estimation and speech enhancement, across multiple datasets and noise conditions. Experimental results demonstrate consistent performance improvements over conventional approaches and provide clear evidence of the robustness and practical effectiveness of the SPP-based framework in non-stationary environments.
| Original language | English |
|---|---|
| Article number | 22 |
| Journal | Journal on Audio, Speech, and Music Processing |
| Volume | 2026 |
| Issue number | 1 |
| Number of pages | 17 |
| ISSN | 3091-4523 |
| DOI | |
| Status | Published - 2026 |
Publication
- 1 Preprint
- Learning-based A Posteriori Speech Presence Probability Estimation and Applications
  Tao, S., Jensen, J. R., Xiang, Y., Reddy, H., Zhang, Q. & Christensen, M. G., 23 Jan. 2025, arXiv. Publication: Working paper/Preprint › Preprint
  Open access