From the eardrum to the auditory cortex, where acoustic stimuli are decoded, there are several stages of auditory processing and transmission where information may potentially be lost. In this paper, we aim at quantifying the total information loss in the human auditory system by using information theoretic tools. To do so, we consider a speech communication model, where words are uttered and sent through a noisy channel, and then received and processed by a human listener. We define a notion of information loss that is related to the human word recognition rate. To assess the word recognition rate of humans, we conduct a closed-vocabulary intelligibility test. We derive upper and lower bounds on the information loss. Simulations reveal that the bounds are tight and we observe that the information loss in the human auditory system increases as the signal to noise ratio (SNR) decreases. Our framework also allows us to study whether humans are optimal in terms of speech perception in a noisy environment. Toward that end, we derive optimal classifiers and compare the human and machine performance in terms of information loss and word recognition rate. We observe a higher information loss and lower word recognition rate for humans compared to the optimal classifiers. In fact, depending on the SNR, the machine classifier may outperform humans by as much as 8 dB. This implies that for the speech-in-stationary-noise setup considered here, the human auditory system is suboptimal for recognizing noisy words.
|Journal||IEEE/ACM Transactions on Audio, Speech, and Language Processing|
|Number of pages||10|
|Publication status||Published - Mar 2019|
- Gaussian mixture model
- Human auditory system
- maximum likelihood classifier
- mutual information