Abstract
An i-vector is a low-dimensional, fixed-length representation of a variable-length speech utterance, defined as the posterior mean of a latent variable conditioned on the observed feature sequence of the utterance. The standard assumption is that the prior for the latent variable is non-informative, because for homogeneous datasets an informative prior offers no gain in generality. This work shows that, for a heterogeneous dataset containing speech samples recorded from multiple sources, extracting i-vectors with informative priors instead is applicable and leads to favorable results. Tests carried out on the NIST 2008 and 2010 Speaker Recognition Evaluation (SRE) datasets show that the proposed method beats three baselines. For the short2-short3 core task in SRE'08, five out of eight common conditions were beaten for the female case and six out of eight for the male case. For the core-core task in SRE'10, five out of nine common conditions were beaten for both genders.
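
The abstract gives no formulas, so the following is only a sketch of how a prior enters the posterior mean, using the textbook Gaussian result for the total variability model. All notation is an assumption and not taken from the paper: $w$ is the latent variable, $T$ the total variability matrix, $\Sigma$ the UBM covariance, $N$ the matrix of zeroth-order statistics, $\tilde{f}$ the centered first-order statistics, and $(\mu_p, \Sigma_p)$ the mean and covariance of a hypothetical informative prior. The paper's exact formulation may differ.

```latex
% Standard (non-informative) prior: w ~ N(0, I)
\begin{align}
  L      &= I + T^{\top} \Sigma^{-1} N T, \\
  \phi   &= L^{-1} T^{\top} \Sigma^{-1} \tilde{f}.
\end{align}

% Informative prior (assumed form): w ~ N(\mu_p, \Sigma_p)
\begin{align}
  L_p    &= \Sigma_p^{-1} + T^{\top} \Sigma^{-1} N T, \\
  \phi_p &= L_p^{-1} \left( T^{\top} \Sigma^{-1} \tilde{f} + \Sigma_p^{-1} \mu_p \right).
\end{align}
```

Setting $\mu_p = 0$ and $\Sigma_p = I$ recovers the standard case, which is one way to see that the informative prior generalizes the usual extraction rather than replacing it.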
Original language | English |
---|---|
Title | IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015 |
Publisher | IEEE Signal Processing Society |
Publication date | Apr. 2015 |
Pages | 4185 - 4189 |
ISBN (Electronic) | 978-1-4673-6997-8 |
DOI | |
Status | Published - Apr. 2015 |
Event | 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015 - Brisbane, Australia. Duration: 19 Apr. 2015 → 24 Apr. 2015. Conference number: 2015 |
Conference
Conference | 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015 |
---|---|
Number | 2015 |
Country/Territory | Australia |
City | Brisbane |
Period | 19/04/2015 → 24/04/2015 |

Name | I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings |
---|---|
ISSN | 1520-6149 |
Keywords
- i-vector, informative prior, total variability, source variation