Minimum Processing Near-End Listening Enhancement

Andreas Jonas Fuglsig; Jesper Jensen; Zheng Hua Tan; Lars Sondergaard Bertelsen; Jens Christian Lindof; Jan Ostergaard

doi:10.48550/arXiv.2210.17154

Andreas Jonas Fuglsig^*, Jesper Jensen, Zheng Hua Tan, Lars Sondergaard Bertelsen, Jens Christian Lindof, Jan Ostergaard

^*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

The intelligibility and quality of speech from a mobile phone or public announcement system are often affected by background noise in the listening environment. By pre-processing the speech signal, it is possible to improve speech intelligibility and quality; this is known as near-end listening enhancement (NLE). Although existing NLE techniques can greatly increase intelligibility in harsh noise environments, in favorable noise conditions the intelligibility of speech reaches a ceiling beyond which it cannot be further enhanced. Because existing methods focus solely on improving intelligibility, they process the speech signal more than necessary in such conditions, which leads to speech distortion and quality degradation. In this article, we provide a new rationale for NLE, where the target speech is minimally processed in terms of a processing penalty, provided that a certain performance constraint, e.g., intelligibility, is satisfied. We present a closed-form solution for the case where the performance criterion is an intelligibility estimator based on the approximated speech intelligibility index and the processing penalty is the mean-square error between the processed and the clean speech. This produces an NLE method that adapts to changing noise conditions via a simple gain rule: it limits the processing to the minimum necessary to achieve a desired intelligibility, while focusing on quality in favorable noise situations by minimizing the amount of speech distortion. Through simulation studies, we show that the proposed method attains speech quality on par with or better than existing methods in both objective measurements and subjective listening tests, while sustaining objective speech intelligibility performance on par with existing methods.
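
As a rough illustration of the rationale described in the abstract, the minimum-processing idea can be written as a constrained optimization problem. The symbols below ($s$, $\tilde{s}_g$, $g$, $n$, $\mathcal{I}$, $I_{\min}$) are notation assumed here for illustration; they are not taken from the paper, and the paper's exact formulation and closed-form gain rule may differ.

```latex
% Sketch only: notation is assumed, not the authors' own.
\begin{equation}
  \min_{g} \; \mathbb{E}\!\left[ \left( \tilde{s}_g - s \right)^{2} \right]
  \quad \text{subject to} \quad
  \mathcal{I}\!\left( \tilde{s}_g, n \right) \ge I_{\min}
\end{equation}
```

Here $s$ is the clean speech, $\tilde{s}_g$ the speech processed with parameters $g$, $n$ the near-end noise, $\mathcal{I}$ an intelligibility estimator (one based on the approximated speech intelligibility index in the case the abstract describes), and $I_{\min}$ the desired intelligibility. If the unprocessed speech already satisfies the constraint, the penalty is minimized by leaving it untouched, which matches the behavior described for favorable noise conditions.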

Original language	English
Journal	IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume	31
Pages (from-to)	2233-2245
Number of pages	13
ISSN	2329-9290
DOIs	https://doi.org/10.48550/arXiv.2210.17154 https://doi.org/10.1109/TASLP.2023.3282094
Publication status	Published - 5 Jun 2023

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Keywords

adaptive
approximated speech intelligibility index
Minimum processing
near-end listening enhancement
optimization
speech intelligibility
speech quality

Access to Document

10.48550/arXiv.2210.17154 (Licence: Other)
10.1109/TASLP.2023.3282094 (Licence: CC BY-NC-ND 4.0)

2210.17154v1: Accepted author manuscript, 700 KB
Open Access article: Final published version, 856 KB (Licence: CC BY-NC-ND 4.0)

Cite this

@article{cdee6519f44441ed999dbbd5245771c2,

title = "Minimum Processing Near-End Listening Enhancement",

keywords = "adaptive, approximated speech intelligibility index, Minimum processing, near-end listening enhancement, optimization, speech intelligibility, speech quality",

author = "Fuglsig, {Andreas Jonas} and Jensen, Jesper and Tan, {Zheng Hua} and Bertelsen, {Lars Sondergaard} and Lindof, {Jens Christian} and Ostergaard, Jan",

note = "Publisher Copyright: {\textcopyright} 2014 IEEE.",

year = "2023",

month = jun,

day = "5",

doi = "10.48550/arXiv.2210.17154",

language = "English",

volume = "31",

pages = "2233--2245",

journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",

issn = "2329-9290",

publisher = "IEEE Signal Processing Society",

}

TY - JOUR

T1 - Minimum Processing Near-End Listening Enhancement

AU - Fuglsig, Andreas Jonas

AU - Jensen, Jesper

AU - Tan, Zheng Hua

AU - Bertelsen, Lars Sondergaard

AU - Lindof, Jens Christian

AU - Ostergaard, Jan

PY - 2023/6/5

Y1 - 2023/6/5

KW - adaptive

KW - approximated speech intelligibility index

KW - Minimum processing

KW - near-end listening enhancement

KW - optimization

KW - speech intelligibility

KW - speech quality

UR - http://www.scopus.com/inward/record.url?scp=85161587364&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2210.17154

DO - 10.48550/arXiv.2210.17154

M3 - Journal article

SN - 2329-9290

VL - 31

SP - 2233

EP - 2245

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

ER -
