Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement

Himavanth Reddy; Asutosh Kar; Jan Østergaard

doi:10.1016/j.apacoust.2022.108627

Research output: Contribution to journal › Journal article › Research › peer-review

6 Citations (Scopus)

Abstract

We compare the run-time complexity of recent deep neural network (DNN) and non-DNN-based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm-based DNNs and compare their performance to the image analysis technique, which is non-DNN-based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise, as well as interfering speech, are considered.
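As a rough illustration of how run-time complexity in floating-point operations can be estimated for a fully connected network, the sketch below counts the operations in one forward pass. It is not taken from the paper: the function name `dense_flops`, the layer sizes, and the convention of counting one multiplication and one addition per weight (with the bias folded into the additions and activation functions ignored) are all assumptions made for this example.

```python
def dense_flops(layer_sizes):
    """Estimate FLOPs for one forward pass of a fully connected network.

    layer_sizes is [n_inputs, n_hidden_1, ..., n_outputs].
    Assumed convention: each output unit needs n_in multiplications and
    n_in additions (the last addition is the bias), so a layer costs
    2 * n_in * n_out FLOPs. Activation functions are not counted.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += 2 * n_in * n_out
    return total


# Hypothetical architecture: 257 spectral bins in and out, two hidden layers.
example_layers = [257, 512, 512, 257]
print(dense_flops(example_layers))
```

Under this convention the cost grows with the product of adjacent layer widths, so narrowing the hidden layers reduces the per-frame operation count directly.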

Original language	English
Article number	108627
Journal	Applied Acoustics
Volume	190
ISSN	0003-682X
DOI	https://doi.org/10.1016/j.apacoust.2022.108627
Status	Published - 15 Mar 2022

Citation formats

@article{755e07b949f942faa2aafc291d827ea4,
  title = "Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement",
  abstract = "We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.",
  keywords = "Fully connected neural network, Low complexity architecture, Modified rhyme test, Speech enhancement, Speech intelligibility, Speech quality",
  author = "Himavanth Reddy and Asutosh Kar and Jan {\O}stergaard",
  year = "2022",
  month = mar,
  day = "15",
  doi = "10.1016/j.apacoust.2022.108627",
  language = "English",
  volume = "190",
  journal = "Applied Acoustics",
  issn = "0003-682X",
  publisher = "Pergamon Press",
}

TY  - JOUR
T1  - Performance Analysis of Low Complexity Fully Connected Neural Networks for Monaural Speech Enhancement
AU  - Reddy, Himavanth
AU  - Kar, Asutosh
AU  - Østergaard, Jan
PY  - 2022/3/15
Y1  - 2022/3/15
N2  - We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.
AB  - We compare the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. Specifically, we consider fully connected, convolutional, and genetic-algorithm based DNNs and compare their performance to the image analysis technique, which is non-DNN based. It is demonstrated that for the same speech enhancement performance, a simple fully connected DNN has the lowest run-time computational complexity in terms of floating-point operations and execution time on a standard laptop. The objective indices used for the evaluation of the speech enhancement performance are the perceptual evaluation of speech quality and short-time objective intelligibility measures. In addition, the subjective intelligibility measures involved in the experiment are the modified rhyme test and the mean opinion score. Both stationary and non-stationary noise in addition to interfering speech is considered.
KW  - Fully connected neural network
KW  - Low complexity architecture
KW  - Modified rhyme test
KW  - Speech enhancement
KW  - Speech intelligibility
KW  - Speech quality
UR  - http://www.scopus.com/inward/record.url?scp=85123358944&partnerID=8YFLogxK
U2  - 10.1016/j.apacoust.2022.108627
DO  - 10.1016/j.apacoust.2022.108627
M3  - Journal article
SN  - 0003-682X
VL  - 190
JO  - Applied Acoustics
JF  - Applied Acoustics
M1  - 108627
ER  -
