
Learning-based a posteriori speech presence probability estimation and applications

Publication: Contribution to journal › Journal article › Research › Peer-reviewed


Abstract

The a posteriori speech presence probability (SPP) plays a critical role in noise power spectral density (PSD) estimation, which is essential for both speech enhancement and recognition systems. While existing SPP estimators perform well under stationary noise conditions, challenges remain in accurately estimating SPP in non-stationary environments and in reducing the computational complexity of deep learning-based methods. In this paper, we build upon a previously proposed hybrid global–local information-based SPP estimation framework and extend its analysis through a comprehensive experimental study. The framework incorporates joint global and local spectral representations and includes targeted refinements aimed at improving robustness in low signal-to-noise ratio (SNR) scenarios. Beyond standalone SPP estimation, the proposed approach is evaluated in downstream applications, including noise PSD estimation and speech enhancement, across multiple datasets and noise conditions. Experimental results demonstrate consistent performance improvements over conventional approaches and provide clear evidence of the robustness and practical effectiveness of the SPP-based framework in non-stationary environments.
Original language: English
Article number: 22
Journal: Journal on Audio, Speech, and Music Processing
Volume: 2026
Issue number: 1
Number of pages: 17
ISSN: 3091-4523
DOI
Status: Published - 2026
