Abstract
We compare the run-time complexity of recent deep neural network (DNN) and non-DNN-based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm-based DNNs and compare their performance to the image analysis technique, which is non-DNN-based. We show that, for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used to evaluate speech enhancement performance are the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) measures. In addition, the subjective intelligibility measures used in the experiment are the modified rhyme test and the mean opinion score. Stationary and non-stationary noise, as well as interfering speech, are considered.
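As a rough illustration of how run-time complexity in floating-point operations can be counted for a fully connected network, the following minimal sketch tallies the operations in one forward pass. The function name `dense_flops` and the layer sizes are hypothetical and are not taken from the paper; the count assumes each output unit costs one multiply and one add per input (including the bias add) and ignores activation functions.

```python
def dense_flops(layer_sizes):
    """Count FLOPs for one forward pass through a fully connected network.

    Assumption: every layer has a bias, and activation functions are ignored.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Per output unit: n_in multiplies + (n_in - 1) adds + 1 bias add = 2 * n_in
        total += 2 * n_in * n_out
    return total


# Hypothetical layer sizes chosen only for illustration (not from the paper):
# one spectral frame in, two hidden layers, one enhanced frame out.
example_sizes = [257, 512, 512, 257]
flops_per_frame = dense_flops(example_sizes)
print(f"FLOPs per frame: {flops_per_frame}")
```

Multiplying the per-frame count by the number of frames per second gives an operations-per-second figure that can be compared across architectures, independently of the hardware used for timing.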
Original language | English |
---|---|
Article number | 108627 |
Journal | Applied Acoustics |
Volume | 190 |
ISSN | 0003-682X |
Publication status | Published - 15 Mar 2022 |
Keywords
- Fully connected neural network
- Low complexity architecture
- Modified rhyme test
- Speech enhancement
- Speech intelligibility
- Speech quality