Vocal Tract Length Perturbation for Text-Dependent Speaker Verification with Autoregressive Prediction Coding

Achintya Sarkar, Zheng-Hua Tan

Publication: Contribution to journal › Journal article › Research › peer-review

9 Citations (Scopus)
27 Downloads (Pure)

Abstract

In this letter, we propose a vocal tract length (VTL) perturbation method for text-dependent speaker verification (TD-SV), in which a set of TD-SV systems is trained, one for each VTL factor, and score-level fusion is applied to make a final decision. Next, we explore the bottleneck (BN) feature extracted by training deep neural networks with a self-supervised learning objective, autoregressive predictive coding (APC), for TD-SV and compare it with the well-studied speaker-discriminant BN feature. The proposed VTL method is then applied to APC and speaker-discriminant BN features. In the end, we combine the VTL perturbation systems trained on MFCC and the two BN features in the score domain. Experiments are performed on the RedDots challenge 2016 database of TD-SV using short utterances with Gaussian mixture model-universal background model and i-vector techniques. Results show the proposed methods significantly outperform the baselines.
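
The score-level fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the equal-weight average below is an assumption, and the warping factors, scores, threshold, and the function names `fuse_scores` and `accept` are hypothetical values and names chosen only for illustration.

```python
def fuse_scores(scores_by_factor, weights=None):
    """Combine the scores that several TD-SV systems assign to one trial.

    scores_by_factor maps a VTL warping factor to the verification score
    produced by the system trained with that factor. If no weights are
    given, every system contributes equally (an assumed fusion rule).
    """
    if not scores_by_factor:
        raise ValueError("at least one system score is required")
    factors = sorted(scores_by_factor)
    if weights is None:
        weights = {factor: 1.0 / len(factors) for factor in factors}
    return sum(weights[factor] * scores_by_factor[factor] for factor in factors)


def accept(fused_score, threshold=0.0):
    """Accept the claimed identity when the fused score reaches the threshold."""
    return fused_score >= threshold


# Hypothetical scores for a single trial from three per-factor systems.
trial_scores = {0.9: 1.5, 1.0: 2.0, 1.1: 0.5}
fused = fuse_scores(trial_scores)
decision = accept(fused, threshold=1.0)
```

The same function could also combine the MFCC and BN-feature systems in the score domain by passing one score per system, with weights supplied explicitly if the systems should not contribute equally.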

Original language: English
Article number: 9339931
Journal: IEEE Signal Processing Letters
Volume: 28
Pages (from-to): 364-368
Number of pages: 5
ISSN: 1070-9908
DOI
Status: Published - 28 Jan 2021
