Spectro-temporal modulation glimpsing for speech intelligibility prediction

Amin Edraki*, Wai Yip Chan, Jesper Jensen, Daniel Fogerty

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

3 Citations (Scopus)

Abstract

We compare two alternative speech intelligibility prediction algorithms: time-frequency glimpse proportion (GP) and spectro-temporal glimpsing index (STGI). Both algorithms hypothesize that listeners understand speech in challenging acoustic environments by “glimpsing” partially available information from degraded speech. GP defines glimpses as those time-frequency regions whose local signal-to-noise ratio is above a certain threshold and estimates intelligibility as the proportion of the time-frequency regions glimpsed. STGI, on the other hand, applies glimpsing to the spectro-temporal modulation (STM) domain and uses a similarity measure based on the normalized cross-correlation between the STM envelopes of the clean and degraded speech signals to estimate intelligibility as the proportion of the STM channels glimpsed. Our experimental results demonstrate that STGI extends the notion of glimpsing proportion to a wider range of distortions, including non-linear signal processing, and outperforms GP for the additive uncorrelated noise datasets we tested. Furthermore, the results show that spectro-temporal modulation analysis enables STGI to account for the effects of masker type on speech intelligibility, leading to superior performance over GP in modulated noise datasets.
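The two glimpsing ideas described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the 3 dB local-SNR threshold, the 0.5 correlation threshold, and the representation of spectrograms and STM envelopes as plain nested lists are all assumptions made for illustration. The second function also skips the spectro-temporal modulation decomposition itself and assumes the per-channel envelopes are already available.

```python
import math


def glimpse_proportion(speech_power, noise_power, threshold_db=3.0):
    """Fraction of time-frequency cells whose local SNR exceeds threshold_db.

    speech_power, noise_power: equal-shaped nested lists (frequency x time)
    of strictly positive power values. The 3 dB default is an assumption.
    """
    glimpsed = 0
    total = 0
    for speech_row, noise_row in zip(speech_power, noise_power):
        for s, n in zip(speech_row, noise_row):
            local_snr_db = 10.0 * math.log10(s / n)
            if local_snr_db > threshold_db:
                glimpsed += 1
            total += 1
    return glimpsed / total


def normalized_cross_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    # A constant envelope has no defined correlation; treat it as not glimpsed.
    return num / den if den > 0 else 0.0


def modulation_glimpse_proportion(clean_envelopes, degraded_envelopes,
                                  threshold=0.5):
    """Fraction of channels whose clean/degraded envelope similarity
    exceeds threshold (hypothetical value, chosen only for this sketch).

    Each argument is a list of per-channel envelopes (lists of floats).
    """
    scores = [
        normalized_cross_correlation(c, d)
        for c, d in zip(clean_envelopes, degraded_envelopes)
    ]
    return sum(1 for s in scores if s > threshold) / len(scores)
```

In the first function a cell counts as glimpsed only when speech dominates noise locally, so the measure needs separate access to the speech and noise signals. The second function compares clean and degraded envelopes directly, which is why a similarity-based criterion can also be applied to non-linearly processed speech, where no separate noise signal exists.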

Original language: English
Article number: 108620
Journal: Hearing Research
Volume: 426
ISSN: 0378-5955
Publication status: Published - Dec 2022

Bibliographical note

Funding Information:
This work (A.E. & W.-Y.C.) was supported in part by the Natural Sciences and Engineering Research Council of Canada and the Demant Foundation. A portion of this work (D.F.) was also supported by the National Institutes of Health, National Institute on Deafness and Other Communication Disorders, Grant No. R01-DC015465. The authors would like to thank the following researchers for providing intelligibility data: Carol Chermaz, Cees Taal, and Steven Van Kuyk.

Publisher Copyright:
© 2022 Elsevier B.V.

Keywords

  • Glimpsing
  • Spectro-temporal modulation
  • Speech intelligibility
