Online Multichannel Speech Enhancement Based on Recursive EM and DNN-based Speech Presence Estimation

Juan M. Martín-Doñas, Jesper Jensen, Zheng-Hua Tan, Angel Gomez, Antonio Peinado

Research output: Contribution to journal › Journal article › Research › peer-review

11 Citations (Scopus)
215 Downloads (Pure)

Abstract

This article presents a recursive expectation-maximization algorithm for online multichannel speech enhancement. A deep neural network mask estimator is used to compute the speech presence probability, which is then improved by means of statistical spatial models of the noisy speech and noise signals. The clean speech signal is estimated using beamforming, single-channel linear postfiltering and speech presence masking. The clean speech statistics and speech presence probabilities are finally used to compute the acoustic parameters for beamforming and postfiltering by means of maximum likelihood estimation. This iterative procedure is carried out on a frame-by-frame basis. The algorithm integrates the different estimates in a common statistical framework suitable for online scenarios. Moreover, our method can successfully exploit spectral, spatial and temporal speech properties. Our proposed algorithm is tested in different noisy environments using the multichannel recordings of the CHiME-4 database. The experimental results show that our method outperforms other related state-of-the-art approaches in noise reduction performance, while allowing low-latency processing for real-time applications.
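The frame-by-frame loop described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' recursive EM algorithm, whose equations are not given here. It only shows the general pattern of using a speech presence probability (SPP) to steer a recursive noise covariance update, followed by beamforming and speech presence masking. Everything in it is an assumption made for illustration: two microphones, a single frequency bin, a known steering vector `d`, an MVDR beamformer, a fixed smoothing factor `alpha`, hard-coded SPP values standing in for a DNN mask estimator, and the function names themselves. The postfilter and the maximum likelihood parameter updates are left out.

```python
import cmath
import random


def update_noise_cov(phi, y, spp, alpha=0.95):
    """Recursive, SPP-weighted update of a 2x2 noise covariance matrix.

    When spp is close to 1 (speech likely present) the effective smoothing
    factor approaches 1, so the current frame barely changes the estimate.
    """
    a = alpha + (1.0 - alpha) * spp
    return [[a * phi[i][j] + (1.0 - a) * y[i] * y[j].conjugate()
             for j in range(2)] for i in range(2)]


def mvdr_weights(phi, d):
    """MVDR weights w = phi^-1 d / (d^H phi^-1 d) for two channels."""
    det = phi[0][0] * phi[1][1] - phi[0][1] * phi[1][0]
    inv = [[phi[1][1] / det, -phi[0][1] / det],
           [-phi[1][0] / det, phi[0][0] / det]]
    pd = [inv[i][0] * d[0] + inv[i][1] * d[1] for i in range(2)]
    denom = d[0].conjugate() * pd[0] + d[1].conjugate() * pd[1]
    return [pd[0] / denom, pd[1] / denom]


def enhance_frame(w, y, spp):
    """Beamformer output w^H y, scaled by the speech presence probability."""
    return spp * (w[0].conjugate() * y[0] + w[1].conjugate() * y[1])


# Toy simulation (all values hypothetical).
random.seed(0)
d = [1 + 0j, cmath.exp(-0.5j)]        # assumed known steering vector
phi = [[1 + 0j, 0j], [0j, 1 + 0j]]    # initial noise covariance (identity)

for t in range(200):
    speech_active = (t % 4 == 0)
    s = complex(random.gauss(0, 1), random.gauss(0, 1)) if speech_active else 0j
    noise = [complex(random.gauss(0, 0.3), random.gauss(0, 0.3))
             for _ in range(2)]
    y = [d[i] * s + noise[i] for i in range(2)]
    spp = 0.9 if speech_active else 0.1   # stand-in for a DNN mask output
    phi = update_noise_cov(phi, y, spp)   # update statistics for this frame
    w = mvdr_weights(phi, d)              # recompute the beamformer
    s_hat = enhance_frame(w, y, spp)      # enhanced sample for this frame
```

Each frame is processed using only current and past data, which is what makes this kind of recursion suitable for low-latency use. The MVDR weights keep a unit response in the assumed look direction, so `w^H d` equals 1 whatever the current noise estimate is.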

Original language: English
Article number: 9252844
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 28
Pages (from-to): 3080-3094
Number of pages: 15
ISSN: 2329-9290
Publication status: Published - Dec 2020

Keywords

  • Acoustics
  • Array signal processing
  • Computational modeling
  • Estimation
  • Kalman filter
  • Noise measurement
  • Recursive expectation-maximization
  • Speech enhancement
  • Deep neural networks
  • Multichannel speech enhancement
  • Speech presence probability
