Comparison of Forced-Alignment Speech Recognition and Humans for Generating Reference VAD

Ivan Kraljevski, Zheng-Hua Tan, Maria Paola Bissiri

Publication: Contribution to book/anthology/report/conference proceeding › Conference article in proceeding › Research › Peer-reviewed

6 Citations (Scopus)

Abstract

This paper aims to answer the question of whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. The level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was investigated. Statistical analysis was then performed on the automatically produced and the collected manual transcriptions. Experimental results confirmed that forced-alignment speech recognition can provide accurate and consistent VAD labels.
Original language: English
Title: INTERSPEECH-2015
Number of pages: 5
Publisher: ISCA
Publication date: 2015
Pages: 2937-2941
Status: Published - 2015
Event: INTERSPEECH 2015 16th Annual Conference of the International Speech Communication Association - Dresden, Germany
Duration: 6 Sep 2015 to 10 Sep 2015

Conference

Conference: INTERSPEECH 2015 16th Annual Conference of the International Speech Communication Association
Country/Territory: Germany
City: Dresden
Period: 6 Sep 2015 to 10 Sep 2015
Name: INTERSPEECH
ISSN: 1990-9770
