Abstract
This paper describes the speaker identification (SID) system developed
by the Patrol team for the first phase of the DARPA RATS (Robust
Automatic Transcription of Speech) program, which seeks to
advance state-of-the-art detection capabilities on audio from highly
degraded communication channels. We present results using multiple
SID systems differing mainly in the algorithms used for voice
activity detection (VAD) and feature extraction. We show that (a)
unsupervised VAD performs as well as supervised methods in terms
of downstream SID performance, (b) noise-robust feature extraction
methods such as CFCCs outperform MFCC front-ends on noisy audio,
and (c) fusion of multiple systems provides a 24% relative improvement
in equal error rate (EER) compared to the single best system when using a
novel SVM-based fusion algorithm that uses side information such
as gender, language, and channel ID.
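
As a reading aid for the EER figure quoted above, the following is a minimal sketch of how an equal error rate and a relative improvement are commonly computed. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement` and all scores and EER values below are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and false-alarm rate are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


def relative_improvement(baseline_eer, fused_eer):
    """Relative EER reduction of a fused system with respect to a baseline."""
    return (baseline_eer - fused_eer) / baseline_eer


# Hypothetical toy scores (not from the paper).
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(targets, nontargets)

# Hypothetical EERs: a drop from 10.0% to 7.6% is a 24% relative improvement.
toy_gain = relative_improvement(10.0, 7.6)
```

Under this definition, a relative improvement is measured against the baseline EER, so the same absolute reduction counts for more when the baseline error is already low.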
Original language | English |
---|---|
Title | Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on |
Number of pages | 5 |
Publisher | IEEE |
Publication date | 2013 |
Pages | 6768 - 6772 |
ISBN (Print) | 978-1-4799-0356-6 |
DOI | |
Status | Published - 2013 |
Event | 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing - Vancouver, Canada. Duration: 26 May 2013 → 31 May 2013. Conference number: 38 |
Conference
Conference | 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
Number | 38 |
Country/Territory | Canada |
City | Vancouver |
Period | 26/05/2013 → 31/05/2013 |
Name | I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings |
---|---|
ISSN | 1520-6149 |