Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII

Richard C. Hendriks; João B. Crespo; Jesper Jensen; Cees H. Taal

doi:10.1109/TASLP.2015.2409780

Department of Electronic Systems

Research output: Contribution to journal › Journal article › Research › peer-review

30 Citations (Scopus)

Abstract

The presence of environmental additive noise in the vicinity of the user typically degrades the speech intelligibility of speech processing applications. This intelligibility loss can be compensated by properly preprocessing the speech signal prior to playout, often referred to as near-end speech enhancement. Although the majority of such algorithms focus primarily on the presence of additive noise, reverberation can also severely degrade intelligibility. In this paper we investigate how late reverberation and additive noise can be jointly taken into account in the near-end speech enhancement process. For this effort we use a recently presented approximation of the speech intelligibility index under a power constraint, which we optimize for speech degraded by both additive noise and late reverberation. The algorithm results in time-frequency dependent amplification factors that depend on both the additive noise power spectral density as well as the late reverberation energy. These amplification factors redistribute speech energy across frequency and perform a dynamic range compression. Experimental results using both instrumental intelligibility measures as well as intelligibility listening tests show that the proposed approach improves speech intelligibility over state-of-the-art reference methods when speech signals are degraded simultaneously by additive noise and reverberation. Speech intelligibility improvements in the order of 20% are observed.

Original language	English
Article number	2876407
Journal	IEEE Transactions on Audio, Speech and Language Processing
Volume	23
Issue number	5
Pages (from-to)	851-862
Number of pages	12
ISSN	1558-7916
DOIs	https://doi.org/10.1109/TASLP.2015.2409780
Publication status	Published - 1 May 2015

Keywords

additive noise
approximated speech intelligibility index (SII)
late reverberation
speech intelligibility

Cite this

@article{2a2ab59b550740e799403dd15e86bd2d,

title = "Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII",

abstract = "The presence of environmental additive noise in the vicinity of the user typically degrades the speech intelligibility of speech processing applications. This intelligibility loss can be compensated by properly preprocessing the speech signal prior to playout, often referred to as near-end speech enhancement. Although the majority of such algorithms focus primarily on the presence of additive noise, reverberation can also severely degrade intelligibility. In this paper we investigate how late reverberation and additive noise can be jointly taken into account in the near-end speech enhancement process. For this effort we use a recently presented approximation of the speech intelligibility index under a power constraint, which we optimize for speech degraded by both additive noise and late reverberation. The algorithm results in time-frequency dependent amplification factors that depend on both the additive noise power spectral density as well as the late reverberation energy. These amplification factors redistribute speech energy across frequency and perform a dynamic range compression. Experimental results using both instrumental intelligibility measures as well as intelligibility listening tests show that the proposed approach improves speech intelligibility over state-of-the-art reference methods when speech signals are degraded simultaneously by additive noise and reverberation. Speech intelligibility improvements in the order of 20% are observed.",

keywords = "additive noise, approximated speech intelligibility index (SII), late reverberation, speech intelligibility",

author = "Hendriks, {Richard C.} and Crespo, {Jo{\~a}o B.} and Jensen, Jesper and Taal, {Cees H.}",

year = "2015",

month = may,

day = "1",

doi = "10.1109/TASLP.2015.2409780",

language = "English",

volume = "23",

pages = "851--862",

journal = "IEEE Transactions on Audio, Speech and Language Processing",

issn = "1558-7916",

publisher = "IEEE Signal Processing Society",

number = "5",

}

Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII. / Hendriks, Richard C.; Crespo, João B.; Jensen, Jesper et al.
In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 23, No. 5, 2876407, 01.05.2015, p. 851-862.

TY - JOUR

T1 - Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII

AU - Hendriks, Richard C.

AU - Crespo, João B.

AU - Jensen, Jesper

AU - Taal, Cees H.

PY - 2015/5/1

Y1 - 2015/5/1

N2 - The presence of environmental additive noise in the vicinity of the user typically degrades the speech intelligibility of speech processing applications. This intelligibility loss can be compensated by properly preprocessing the speech signal prior to playout, often referred to as near-end speech enhancement. Although the majority of such algorithms focus primarily on the presence of additive noise, reverberation can also severely degrade intelligibility. In this paper we investigate how late reverberation and additive noise can be jointly taken into account in the near-end speech enhancement process. For this effort we use a recently presented approximation of the speech intelligibility index under a power constraint, which we optimize for speech degraded by both additive noise and late reverberation. The algorithm results in time-frequency dependent amplification factors that depend on both the additive noise power spectral density as well as the late reverberation energy. These amplification factors redistribute speech energy across frequency and perform a dynamic range compression. Experimental results using both instrumental intelligibility measures as well as intelligibility listening tests show that the proposed approach improves speech intelligibility over state-of-the-art reference methods when speech signals are degraded simultaneously by additive noise and reverberation. Speech intelligibility improvements in the order of 20% are observed.

AB - The presence of environmental additive noise in the vicinity of the user typically degrades the speech intelligibility of speech processing applications. This intelligibility loss can be compensated by properly preprocessing the speech signal prior to playout, often referred to as near-end speech enhancement. Although the majority of such algorithms focus primarily on the presence of additive noise, reverberation can also severely degrade intelligibility. In this paper we investigate how late reverberation and additive noise can be jointly taken into account in the near-end speech enhancement process. For this effort we use a recently presented approximation of the speech intelligibility index under a power constraint, which we optimize for speech degraded by both additive noise and late reverberation. The algorithm results in time-frequency dependent amplification factors that depend on both the additive noise power spectral density as well as the late reverberation energy. These amplification factors redistribute speech energy across frequency and perform a dynamic range compression. Experimental results using both instrumental intelligibility measures as well as intelligibility listening tests show that the proposed approach improves speech intelligibility over state-of-the-art reference methods when speech signals are degraded simultaneously by additive noise and reverberation. Speech intelligibility improvements in the order of 20% are observed.

KW - additive noise

KW - approximated speech intelligibility index (SII)

KW - late reverberation

KW - speech intelligibility

U2 - 10.1109/TASLP.2015.2409780

DO - 10.1109/TASLP.2015.2409780

M3 - Journal article

SN - 1558-7916

VL - 23

SP - 851

EP - 862

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

IS - 5

M1 - 2876407

ER -
