TY - JOUR
T1 - Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones
AU - Sahidullah, Md
AU - Thomsen, Dennis Alexander Lehmann
AU - Hautamaki, Rosa Gonzalez
AU - Kinnunen, Tomi
AU - Tan, Zheng-Hua
AU - Parts, Robert
AU - Pitkänen, Martti
PY - 2018
Y1 - 2018
N2 - While having a wide range of applications, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, in particular, replay attacks that are effective and easy to implement. Most prior work on detecting replay attacks uses audio from a single acousticmicrophone only, leading to difficulties in detecting high-end replay attacks close to indistinguishable from live human speech. In this paper, we study the use of a special body-conducted sensor, throat microphone (TM), for combined voice liveness detection (VLD) and ASV in order to improve both robustness and security of ASV against replay attacks.We first investigate the possibility and methods of attacking a TM-based ASV system, followed by a pilot data collection. Second, we study the use of spectral features for VLD using both single-channel and dualchannel ASV systems. We carry out speaker verification experiments using Gaussian mixture model with universal background model (GMM-UBM) and i-vector based systems on a dataset of 38 speakers collected by us. We have achieved considerable improvement in recognition accuracy, with the use of dual-microphone setup. In experiments with noisy test speech, the false acceptance rate (FAR) of the dual-microphone GMM-UBM based system for recorded speech reduces from 69.69% to 18.75%. The FAR of replay condition further drops to 0% when this dual-channel ASV system is integrated with the new dual-channel voice liveness detector.
AB - While having a wide range of applications, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, in particular, replay attacks that are effective and easy to implement. Most prior work on detecting replay attacks uses audio from a single acousticmicrophone only, leading to difficulties in detecting high-end replay attacks close to indistinguishable from live human speech. In this paper, we study the use of a special body-conducted sensor, throat microphone (TM), for combined voice liveness detection (VLD) and ASV in order to improve both robustness and security of ASV against replay attacks.We first investigate the possibility and methods of attacking a TM-based ASV system, followed by a pilot data collection. Second, we study the use of spectral features for VLD using both single-channel and dualchannel ASV systems. We carry out speaker verification experiments using Gaussian mixture model with universal background model (GMM-UBM) and i-vector based systems on a dataset of 38 speakers collected by us. We have achieved considerable improvement in recognition accuracy, with the use of dual-microphone setup. In experiments with noisy test speech, the false acceptance rate (FAR) of the dual-microphone GMM-UBM based system for recorded speech reduces from 69.69% to 18.75%. The FAR of replay condition further drops to 0% when this dual-channel ASV system is integrated with the new dual-channel voice liveness detector.
KW - Automatic speaker verification
KW - anti-spoofing
KW - replay attack
KW - throat microphone
KW - two-channel countermeasure
KW - voice liveness detection
UR - http://www.scopus.com/inward/record.url?scp=85031788681&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2017.2760243
DO - 10.1109/TASLP.2017.2760243
M3 - Journal article
SN - 2329-9290
VL - 26
SP - 44
EP - 56
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
IS - 1
ER -