Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement

Himavanth Reddy; Asutosh Kar; Jan Østergaard

doi:10.1016/j.apacoust.2022.108627

Research output: Contribution to journal › Journal article › Research › peer-review

6 Citations (Scopus)

Abstract

We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise, as well as interfering speech, are considered.
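The abstract measures run-time complexity in floating-point operations. As a rough illustration of how such a count can be obtained for a fully connected network, the sketch below tallies the multiplications and additions of one forward pass. It is not taken from the paper: the function name `dense_flops` and the layer widths are hypothetical, activation functions are ignored, and the paper's own counting convention may differ.

```python
from typing import Sequence


def dense_flops(layer_sizes: Sequence[int], include_bias: bool = True) -> int:
    """Count floating-point operations for one forward pass of a
    fully connected network (activation functions are ignored).

    A layer with n_in inputs and n_out outputs needs n_in * n_out
    multiplications and (n_in - 1) * n_out additions for the dot
    products, plus n_out additions when a bias is used.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # multiplications
        total += (n_in - 1) * n_out    # additions inside the dot products
        if include_bias:
            total += n_out             # bias additions
    return total


# Hypothetical layer widths, chosen only for illustration
# (not the architecture evaluated in the paper).
example_sizes = [257, 512, 512, 257]
print(dense_flops(example_sizes))
```

With a bias term the count simplifies to twice the number of weights per layer, which is why the cost of a fully connected network grows with the product of adjacent layer widths.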

Original language	English
Article number	108627
Journal	Applied Acoustics
Volume	190
ISSN	0003-682X
DOIs	https://doi.org/10.1016/j.apacoust.2022.108627
Publication status	Published - 15 Mar 2022

Keywords

Fully connected neural network
Low complexity architecture
Modified rhyme test
Speech enhancement
Speech intelligibility
Speech quality

Cite this

@article{755e07b949f942faa2aafc291d827ea4,

title = "Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement",

abstract = "We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.",

keywords = "Fully connected neural network, Low complexity architecture, Modified rhyme test, Speech enhancement, Speech intelligibility, Speech quality",

author = "Himavanth Reddy and Asutosh Kar and Jan {\O}stergaard",

year = "2022",

month = mar,

day = "15",

doi = "10.1016/j.apacoust.2022.108627",

language = "English",

volume = "190",

journal = "Applied Acoustics",

issn = "0003-682X",

publisher = "Pergamon Press",

}

TY - JOUR

T1 - Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement

AU - Reddy, Himavanth

AU - Kar, Asutosh

AU - Østergaard, Jan

PY - 2022/3/15

Y1 - 2022/3/15

N2 - We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.

AB - We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.

KW - Fully connected neural network

KW - Low complexity architecture

KW - Modified rhyme test

KW - Speech enhancement

KW - Speech intelligibility

KW - Speech quality

UR - http://www.scopus.com/inward/record.url?scp=85123358944&partnerID=8YFLogxK

U2 - 10.1016/j.apacoust.2022.108627

DO - 10.1016/j.apacoust.2022.108627

M3 - Journal article

SN - 0003-682X

VL - 190

JO - Applied Acoustics

JF - Applied Acoustics

M1 - 108627

ER -
