Gammatone Filter Bank-Deep Neural Network-based Monaural speech enhancement for unseen conditions

Shoba Sivapatham, Asutosh Kar*, Mads Græsbøll Christensen

*Corresponding author

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

3 Citations (Scopus)

Abstract

Speech enhancement has achieved high performance in recent years through deep learning techniques. However, the performance of deep learning-based speech enhancement algorithms degrades for unseen noises and unseen speakers, and the models are limited to a small number of speakers. Hence, we propose a speech enhancement algorithm based on a Gammatone filterbank (GTFB) and a simple deep neural network (SDNN) to improve speech quality under three different unseen conditions. The GTFB gives finer resolution in the low-frequency regions of speech, and the SDNN model takes a noisy GTFB frame as input and maps it to a clean-speech GTFB frame. The results are evaluated objectively using signal-to-noise ratio, perceptual evaluation of speech quality, and short-time objective intelligibility, and subjectively using mean opinion score. The experiments are carried out using a variety of training and testing models. The results show that the proposed GTFB-SDNN is robust to a variety of test situations and outperforms existing methods.
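
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It places gammatone center frequencies uniformly on the ERB-rate scale (Glasberg and Moore formula), which is why channels crowd together at low frequencies, and it runs a small feed-forward network that maps one noisy GTFB frame to one estimated clean frame. The channel count, frequency range, sampling rate, layer sizes, ReLU activation, and all function names are assumptions chosen for illustration; none of them are taken from the paper.

```python
import numpy as np


def hz_to_erb_rate(f_hz):
    """Convert frequency in Hz to the ERB-rate scale (Glasberg & Moore)."""
    return 21.4 * np.log10(4.37e-3 * np.asarray(f_hz, dtype=float) + 1.0)


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(e, dtype=float) / 21.4) - 1.0) / 4.37e-3


def gammatone_center_freqs(n_channels=64, f_low=50.0, f_high=8000.0):
    """Center frequencies spaced uniformly on the ERB-rate scale.

    Assumed defaults: 64 channels between 50 Hz and 8 kHz.
    """
    e = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_channels)
    return erb_rate_to_hz(e)


def gammatone_ir(fc, fs=16000, n_samples=1024, order=4):
    """Impulse response of one 4th-order gammatone filter centered at fc."""
    t = np.arange(n_samples) / fs
    bandwidth = 1.019 * 24.7 * (4.37e-3 * fc + 1.0)  # 1.019 * ERB(fc)
    envelope = t ** (order - 1) * np.exp(-2.0 * np.pi * bandwidth * t)
    return envelope * np.cos(2.0 * np.pi * fc * t)


def sdnn_forward(noisy_frame, weights, biases):
    """Hypothetical 'simple DNN': ReLU hidden layers, linear output layer."""
    h = noisy_frame
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ w + b)
    return h @ weights[-1] + biases[-1]


def random_sdnn(layer_sizes, seed=0):
    """Untrained random parameters, only to exercise the forward pass."""
    rng = np.random.default_rng(seed)
    pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    weights = [rng.normal(0.0, 0.1, size=(m, n)) for m, n in pairs]
    biases = [np.zeros(n) for _, n in pairs]
    return weights, biases


if __name__ == "__main__":
    cf = gammatone_center_freqs()
    spacing = np.diff(cf)
    print("lowest channel spacing (Hz): ", spacing[0])
    print("highest channel spacing (Hz):", spacing[-1])

    weights, biases = random_sdnn([64, 128, 128, 64])
    noisy = np.abs(np.random.default_rng(1).normal(size=64))
    clean_estimate = sdnn_forward(noisy, weights, biases)
    print("output frame shape:", clean_estimate.shape)
```

In a real system the network would be trained on pairs of noisy and clean GTFB frames; the random weights here only demonstrate the input and output shapes of the frame-to-frame mapping.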

Original language: English
Article number: 108784
Journal: Applied Acoustics
Volume: 194
ISSN: 0003-682X
DOI
Status: Published - 15 Jun 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier Ltd
