SII-Based Speech Preprocessing for Intelligibility Improvement in Noise

Cees H. Taal, Jesper Jensen

Publication: Contribution to journal, Conference article in journal, Research, peer-reviewed

Abstract

A linear time-invariant filter is designed to improve speech understanding when the speech is played back in a noisy environment. To accomplish this, the speech intelligibility index (SII) is maximized under the constraint that the speech energy is held constant. A nonlinear approximation of the SII is used so that the constrained optimization problem has a closed-form solution. The resulting filter depends on the long-term average noise and speech spectra and on the global SNR and, in general, has a high-pass characteristic. In contrast to existing methods, the proposed filter sets certain frequency bands to zero when they no longer contribute to intelligibility. Experiments show large intelligibility improvements with the proposed method in stationary speech-shaped noise. However, the method does not perform well for speech corrupted by a competing speaker, because the SII is not a reliable intelligibility predictor for fluctuating noise sources. MATLAB code is provided.
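
The energy-constrained maximization described in the abstract can be illustrated with a small sketch. It is not the authors' MATLAB code, and the exact SII approximation is not given in this record: the sketch assumes a per-band intelligibility term of the form w_i * e_i / (e_i + n_i), where e_i is the speech energy assigned to band i, n_i the noise energy, and w_i a band-importance weight. The function name `allocate_band_energy` and all numbers are hypothetical. Under that assumption the stationarity condition gives e_i = max(0, sqrt(w_i * n_i / nu) - n_i), and the multiplier nu is found here by bisection rather than in closed form.

```python
import numpy as np

def allocate_band_energy(noise, weights, total_energy, iters=200):
    """Maximize sum_i w_i * e_i / (e_i + n_i) subject to
    sum_i e_i = total_energy and e_i >= 0 (assumed objective, see lead-in)."""
    noise = np.asarray(noise, dtype=float)      # per-band noise energy, must be > 0
    weights = np.asarray(weights, dtype=float)  # per-band importance weights

    def energies(nu):
        # Stationarity: w_i * n_i / (e_i + n_i)^2 = nu, clipped at zero.
        return np.maximum(0.0, np.sqrt(weights * noise / nu) - noise)

    # The total allocated energy decreases as nu grows, so bisect on nu.
    # At nu = max(w_i / n_i) every band is clipped to zero.
    lo, hi = 1e-12, np.max(weights / noise)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint, nu spans many decades
        if energies(mid).sum() > total_energy:
            lo = mid
        else:
            hi = mid
    return energies(np.sqrt(lo * hi))

# Hypothetical four-band example: band 0 is noisy and carries little weight.
noise = [4.0, 1.0, 0.25, 0.25]
weights = [0.1, 0.3, 0.3, 0.3]
e = allocate_band_energy(noise, weights, total_energy=1.0)
print(e)
```

In this toy setting the noisy, low-weight band receives no energy at all, which mirrors the abstract's remark that some bands are set to zero once they stop contributing to intelligibility. The remaining energy is shared among the other bands while the total stays fixed.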
Original language: English
Journal: Proceedings of the International Conference on Spoken Language Processing
Pages (from-to): 3582-3586
Number of pages: 6
ISSN: 1990-9772
Status: Published - 2013
Event: Interspeech 2013 - Lyon, France
Duration: 25 Aug 2013 to 29 Aug 2013
http://www.interspeech2013.org/

Conference

Conference: Interspeech 2013
Country: France
City: Lyon
Period: 25/08/2013 to 29/08/2013

Fingerprint

Speech intelligibility
Constrained optimization
MATLAB
Frequency bands

Cite this

@inproceedings{7af4f4da8a904711b9a0d42e967fc0b5,
title = "SII-Based Speech Preprocessing for Intelligibility Improvement in Noise",
abstract = "A linear time-invariant filter is designed in order to improve speech understanding when the speech is played back in a noisy environment. To accomplish this, the speech intelligibility index (SII) is maximized under the constraint that the speech energy is held constant. A nonlinear approximation is used for the SII such that a closed-form solution exists to the constrained optimization problem. The resulting filter is dependent both on the long-term average noise and speech spectrum and the global SNR and, in general, has a high-pass characteristic. In contrast to existing methods, the proposed filter sets certain frequency bands to zero when they do not contribute to intelligibility anymore. Experiments show large intelligibility improvements with the proposed method when used in stationary speech-shaped noise. However, it was also found that the method does not perform well for speech corrupted by a competing speaker. This is due to the fact that the SII is not a reliable intelligibility predictor for fluctuating noise sources. MATLAB code is provided.",
author = "Taal, {Cees H.} and Jesper Jensen",
year = "2013",
language = "English",
pages = "3582--3586",
journal = "Proceedings of the International Conference on Spoken Language Processing",
issn = "1990-9772",
publisher = "International Speech Communication Association",

}

SII-Based Speech Preprocessing for Intelligibility Improvement in Noise. / Taal, Cees H. ; Jensen, Jesper.

In: Proceedings of the International Conference on Spoken Language Processing, 2013, pp. 3582-3586.

TY - GEN

T1 - SII-Based Speech Preprocessing for Intelligibility Improvement in Noise

AU - Taal, Cees H.

AU - Jensen, Jesper

PY - 2013

Y1 - 2013

AB - A linear time-invariant filter is designed in order to improve speech understanding when the speech is played back in a noisy environment. To accomplish this, the speech intelligibility index (SII) is maximized under the constraint that the speech energy is held constant. A nonlinear approximation is used for the SII such that a closed-form solution exists to the constrained optimization problem. The resulting filter is dependent both on the long-term average noise and speech spectrum and the global SNR and, in general, has a high-pass characteristic. In contrast to existing methods, the proposed filter sets certain frequency bands to zero when they do not contribute to intelligibility anymore. Experiments show large intelligibility improvements with the proposed method when used in stationary speech-shaped noise. However, it was also found that the method does not perform well for speech corrupted by a competing speaker. This is due to the fact that the SII is not a reliable intelligibility predictor for fluctuating noise sources. MATLAB code is provided.

M3 - Conference article in Journal

SP - 3582

EP - 3586

JO - Proceedings of the International Conference on Spoken Language Processing

JF - Proceedings of the International Conference on Spoken Language Processing

SN - 1990-9772

ER -