Single Channel Speech Presence Probability Estimation based on Hybrid Global-Local Information

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

Speech presence probability (SPP) estimators work in the short-time Fourier transform domain and estimate, for each time-frequency bin, the probability that speech is present. Most existing SPP estimators achieve high detection accuracy and have been deployed successfully in speech enhancement and automatic speech recognition. In this work, we propose a single-channel a posteriori SPP estimator based on hybrid global-local information. In contrast to existing deep neural network (DNN)-based SPP estimation approaches, our DNN estimator extracts useful speech representations for SPP estimation with a simpler architecture. To exploit hybrid global-local information, an encoder maps the high-dimensional global information into a low-dimensional latent space; each frequency bin is then concatenated with this latent representation to form the hybrid information. Finally, an SPP decoder decodes the hybrid information into the SPP. Experimental results demonstrate that the proposed method achieves high SPP estimation accuracy with low computational complexity, especially in low signal-to-noise ratio conditions.
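
The encode, concatenate, decode data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract does not specify layer types, sizes, activations, or training, so everything concrete here is an assumption made for illustration. In particular, the sketch assumes that "global information" means the full-band log-magnitude spectrum of one frame, uses single dense layers with untrained random weights, and uses the hypothetical names `init_params` and `estimate_spp`.

```python
import numpy as np


def init_params(num_bins, latent_dim, hidden_dim, seed=0):
    """Random, untrained weights (illustrative assumption, not the paper's model)."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(0.0, 0.1, (num_bins, latent_dim)),
        "b_enc": np.zeros(latent_dim),
        "W_hid": rng.normal(0.0, 0.1, (1 + latent_dim, hidden_dim)),
        "b_hid": np.zeros(hidden_dim),
        "w_out": rng.normal(0.0, 0.1, hidden_dim),
        "b_out": 0.0,
    }


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def estimate_spp(log_mag, params):
    """Map a (num_frames, num_bins) log-magnitude spectrogram to SPP values.

    Returns an array of the same shape with entries in (0, 1).
    """
    num_frames, num_bins = log_mag.shape

    # Encoder: compress each frame's full-band (global) spectrum
    # into a low-dimensional latent vector.
    latent = np.tanh(log_mag @ params["W_enc"] + params["b_enc"])

    # Hybrid information: pair every frequency bin (local) with the
    # latent vector of its frame (global).
    local = log_mag[:, :, None]
    global_part = np.broadcast_to(
        latent[:, None, :], (num_frames, num_bins, latent.shape[1])
    )
    hybrid = np.concatenate([local, global_part], axis=-1)

    # Decoder: the same small network is applied to every bin, and a
    # sigmoid turns its output into a probability.
    hidden = np.tanh(hybrid @ params["W_hid"] + params["b_hid"])
    return sigmoid(hidden @ params["w_out"] + params["b_out"])
```

Calling `estimate_spp(log_mag, init_params(num_bins=257, latent_dim=16, hidden_dim=32))` on a `(num_frames, 257)` array returns one value per time-frequency bin. With untrained weights the values are meaningless as probabilities; the sketch only shows how the decoder input for each bin combines a local feature with a shared global latent vector.
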
Original language: English
Title of host publication: Proceedings of the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Number of pages: 5
Publisher: IEEE
Publication date: 22 Oct 2023
Pages: 1-5
Article number: 10248067
ISBN (Print): 979-8-3503-2373-3
ISBN (Electronic): 979-8-3503-2372-6
DOIs
Publication status: Published - 22 Oct 2023
Event: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) - New Paltz, NY, USA
Duration: 22 Oct 2023 → 25 Oct 2023

Conference

Conference: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Location: New Paltz, NY, USA
Period: 22/10/2023 → 25/10/2023
Series: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
ISSN: 1947-1629

Keywords

  • deep neural networks
  • hybrid global-local information
  • speech presence probability
