SII-Based Speech Preprocessing for Intelligibility Improvement in Noise

Cees H. Taal; Jesper Jensen

Department of Electronic Systems

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

A linear time-invariant filter is designed to improve speech understanding when the speech is played back in a noisy environment. To accomplish this, the speech intelligibility index (SII) is maximized under the constraint that the speech energy is held constant. A nonlinear approximation of the SII is used such that the constrained optimization problem has a closed-form solution. The resulting filter depends on the long-term average noise and speech spectra and on the global SNR and, in general, has a high-pass characteristic. In contrast to existing methods, the proposed filter sets certain frequency bands to zero when they no longer contribute to intelligibility. Experiments show large intelligibility improvements with the proposed method in stationary speech-shaped noise. However, the method does not perform well for speech corrupted by a competing speaker, because the SII is not a reliable intelligibility predictor for fluctuating noise sources. MATLAB code is provided.
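
The kind of constrained optimization summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes, purely for illustration, that the audibility of band i is approximated by e_i / (e_i + n_i), where e_i is the processed speech energy and n_i the noise energy in that band; the abstract does not state which approximation the paper uses. The function names, the band-importance weights, and the bisection search on the Lagrange multiplier (used here in place of a closed-form expression) are likewise hypothetical.

```python
import numpy as np


def allocate_band_energy(noise, importance, total_energy, iters=200):
    """Maximize sum_i importance[i] * e[i] / (e[i] + noise[i])
    subject to sum(e) == total_energy and e >= 0.

    ASSUMPTION: the objective is an illustrative stand-in for an
    approximated SII, not the formula from the paper.
    Requires noise > 0, importance > 0 and total_energy > 0.
    """
    noise = np.asarray(noise, dtype=float)
    importance = np.asarray(importance, dtype=float)

    def energy(nu):
        # Stationarity condition of the Lagrangian, clipped at zero:
        # bands whose marginal gain is below nu receive no energy.
        return np.maximum(0.0, np.sqrt(importance * noise / nu) - noise)

    # sum(energy(nu)) decreases as nu grows. At nu = max(importance / noise)
    # every band is clipped to zero, so that value is an upper bracket.
    hi = np.max(importance / noise)
    lo = hi
    while energy(lo).sum() < total_energy:
        lo /= 2.0

    # Bisection on the multiplier until the energy constraint is met.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if energy(mid).sum() > total_energy:
            lo = mid
        else:
            hi = mid
    return energy(0.5 * (lo + hi))


def band_gains(speech, processed_energy):
    """Amplitude gain per band that maps the original speech band
    energies onto the allocated energies."""
    speech = np.asarray(speech, dtype=float)
    return np.sqrt(processed_energy / speech)


# Hypothetical example: noise energy concentrated in the low bands.
noise = np.array([10.0, 5.0, 1.0, 0.5])
importance = np.full(4, 0.25)          # equal band importance (assumed)
speech = np.array([0.25, 0.25, 0.25, 0.25])
energy = allocate_band_energy(noise, importance, total_energy=speech.sum())
gains = band_gains(speech, energy)
print(energy, gains)
```

In this toy setting the two noisiest bands receive no energy at all, and the whole energy budget goes to the bands where it raises the objective most. This mirrors the behavior described in the abstract, where some bands are set to zero, but only under the assumptions listed above.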

Original language	English
Journal	Proceedings of the International Conference on Spoken Language Processing
Pages (from-to)	3582-3586
Number of pages	6
ISSN	1990-9772
Publication status	Published - 2013
Event	Interspeech 2013, Lyon, France, 25 Aug 2013 → 29 Aug 2013, http://www.interspeech2013.org/

Conference

Conference	Interspeech 2013
Country/Territory	France
City	Lyon
Period	25/08/2013 → 29/08/2013
Internet address	http://www.interspeech2013.org/

Cite this

@inproceedings{7af4f4da8a904711b9a0d42e967fc0b5,

title = "SII-Based Speech Preprocessing for Intelligibility Improvement in Noise",

author = "Taal, {Cees H.} and Jesper Jensen",

year = "2013",

language = "English",

pages = "3582--3586",

journal = "Proceedings of the International Conference on Spoken Language Processing",

issn = "1990-9772",

publisher = "International Speech Communication Association",

note = "Interspeech 2013 ; Conference date: 25-08-2013 Through 29-08-2013",

url = "http://www.interspeech2013.org/",

}

TY - GEN

T1 - SII-Based Speech Preprocessing for Intelligibility Improvement in Noise

AU - Taal, Cees H.

AU - Jensen, Jesper

PY - 2013

Y1 - 2013

M3 - Conference article in Journal

SN - 1990-9772

SP - 3582

EP - 3586

JO - Proceedings of the International Conference on Spoken Language Processing

JF - Proceedings of the International Conference on Spoken Language Processing

T2 - Interspeech 2013

Y2 - 25 August 2013 through 29 August 2013

ER -
