Do we need individual head-related transfer functions for vertical localization? The case study of a spectral notch distance metric

Research output: Contribution to journalJournal articleResearchpeer-review

Abstract

This paper deals with the issue of individualizing the head-related transfer function (HRTF) rendering process for auditory elevation perception. Is it possible to find a nonindividual, personalized HRTF set that allows a listener to have an equally accurate localization performance than with his/her individual HRTFs? We propose a psychoacoustically motivated, anthropometry based mismatch function between HRTF pairs that exploits the close relation between the listener's pinna geometry and localization cues. This is evaluated using an auditory model that computes a mapping between HRTF spectra and perceived spatial locations. Results on a large number of subjects in the center for image processing and integrated computing (CIPIC) and acoustics research institute (ARI) HRTF databases suggest that there exists a nonindividual HRTF set, which allows a listener to have an equally accurate vertical localization than with individual HRTFs. Furthermore, we find the optimal parameterization of the proposed mismatch function, i.e., the one that best reflects the information given by the auditory model. Our findings show that the selection procedure yields statistically significant improvements with respect to dummy-head HRTFs or random HRTF selection, with potentially high impact from an applicative point of view.
Close

Details

This paper deals with the issue of individualizing the head-related transfer function (HRTF) rendering process for auditory elevation perception. Is it possible to find a nonindividual, personalized HRTF set that allows a listener to have an equally accurate localization performance than with his/her individual HRTFs? We propose a psychoacoustically motivated, anthropometry based mismatch function between HRTF pairs that exploits the close relation between the listener's pinna geometry and localization cues. This is evaluated using an auditory model that computes a mapping between HRTF spectra and perceived spatial locations. Results on a large number of subjects in the center for image processing and integrated computing (CIPIC) and acoustics research institute (ARI) HRTF databases suggest that there exists a nonindividual HRTF set, which allows a listener to have an equally accurate vertical localization than with individual HRTFs. Furthermore, we find the optimal parameterization of the proposed mismatch function, i.e., the one that best reflects the information given by the auditory model. Our findings show that the selection procedure yields statistically significant improvements with respect to dummy-head HRTFs or random HRTF selection, with potentially high impact from an applicative point of view.
Original languageEnglish
JournalIEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume26
Issue number7
Pages (from-to)1247 - 1260
ISSN2329-9290
DOI
Publication statusPublished - 2018
Publication categoryResearch
Peer-reviewedYes
ID: 273052521