Abstract
Speaker identification research faces challenges from mismatched training and test conditions, which arise from several factors. Non-electronic voice disguise is one such factor and is commonly encountered in crimes. This paper studies the effect of three types of voice disguise, taken from the CHAINS speech corpus, on speaker identification accuracy. Of the three disguises, two are variants of the imitative style, namely synchronous imitation and repetitive synchronous imitation, and one is the fast speaking style. Different variants of multistyle training are investigated as a means of increasing speaker identification accuracy. The manner in which speech examples from the different speaking styles are used for multistyle training plays an important role in the resulting accuracy. Further, a decision-level fusion of two multistyle training schemes is proposed. Experimental results show that, across the different voice disguises, the fused multistyle training performs better overall and more stably than single-style training and the individual multistyle training variants investigated.
Original language | English |
---|---|
Title | ICCo5-2013 Conference Proceedings |
Number of pages | 6 |
Publisher | ICCo5 |
Publication date | Dec. 2013 |
Status | Published - Dec. 2013 |
Event | The First International Conference on Communications, Connectivity, Convergence, Content and Cooperation (IC5) - Mumbai, India. Duration: 16 Dec. 2013 → 19 Dec. 2013 |
Conference
Conference | The First International Conference on Communications, Connectivity, Convergence, Content and Cooperation (IC5) |
---|---|
Country/Territory | India |
City | Mumbai |
Period | 16/12/2013 → 19/12/2013 |