Abstract
Speech enhancement has achieved high performance in recent years through deep learning techniques. However, the performance of deep-learning-based speech enhancement algorithms degrades for unseen noises and unseen speakers; moreover, deep learning models are limited to a small number of speakers. Hence, we propose a Gammatone filterbank (GTFB) and simple deep neural network (SDNN) based speech enhancement algorithm to improve the quality of speech under three different unseen conditions. The GTFB gives a finer resolution in the low-frequency regions of speech, and the SDNN model takes a noisy GTFB frame as input and maps it to a clean speech GTFB frame. The results are measured objectively using signal-to-noise ratio, perceptual evaluation of speech quality, and short-time objective intelligibility, and subjectively using mean opinion score. The experiments are carried out using a variety of training and testing conditions. The results show that the proposed GTFB-SDNN is robust to a variety of test situations and outperforms existing methods.
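To illustrate why a Gammatone filterbank resolves low frequencies more finely, the sketch below places channel centre frequencies uniformly on the ERB-rate scale (Glasberg and Moore), a common choice for Gammatone filterbanks. It is a minimal sketch, not the paper's implementation: the channel count, the frequency range, and the function names are assumptions, since the abstract does not state them.

```python
import math

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_center_freqs(n_channels=64, f_min=50.0, f_max=8000.0):
    """Centre frequencies spaced uniformly on the ERB-rate scale.

    n_channels, f_min and f_max are illustrative assumptions,
    not values taken from the paper.
    """
    lo, hi = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]

center_freqs = gammatone_center_freqs()
# Gap in Hz between neighbouring channels: small at low frequencies,
# growing towards high frequencies.
spacing = [b - a for a, b in zip(center_freqs, center_freqs[1:])]
```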
Original language | English |
---|---|
Article number | 108784 |
Journal | Applied Acoustics |
Volume | 194 |
ISSN | 0003-682X |
DOI | |
Status | Published - 15 Jun 2022 |
Bibliographic note
Publisher Copyright: © 2022 Elsevier Ltd