Joint variable frame rate and length analysis for speech recognition under adverse conditions

Zheng-Hua Tan, Ivan Kraljevski

Publication: Contribution to journal › Journal article › Research › peer-reviewed

8 Citations (Scopus)

Abstract

This paper presents a method that combines variable frame length and variable frame rate analysis for speech recognition in noisy environments, together with an investigation of the effect of different frame lengths on speech recognition performance. The method selects frames using an a posteriori signal-to-noise ratio (SNR) weighted energy distance and increases the length of each selected frame according to the number of non-selected frames preceding it. It assigns a higher frame rate and a normal frame length to a rapidly changing, high-SNR region of a speech signal, and a lower frame rate and an increased frame length to a steady or low-SNR region. The speech recognition results show that the proposed variable frame rate and length method outperforms both fixed frame rate and length analysis and standalone variable frame rate analysis in terms of noise robustness.
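The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the exact distance (absolute log-energy difference weighted by a floored log-domain a posteriori SNR), the fixed threshold, the linear length rule, and all names and default values (`select_frames`, `base_len`, `shift`, `max_len`) are assumptions made purely for illustration.

```python
import math


def select_frames(energies, noise_energy, threshold,
                  base_len=200, shift=80, max_len=400):
    """Return (frame_index, frame_length) pairs for the selected frames.

    energies:     positive per-frame energies from a dense, fixed-shift
                  analysis (hypothetical input).
    noise_energy: positive estimate of the per-frame noise energy
                  (hypothetical input).
    threshold:    accumulated distance needed to select a frame (assumed fixed).
    base_len, shift, max_len: frame length parameters in samples
                  (hypothetical values).
    """
    selected = [(0, base_len)]  # always keep the first frame at normal length
    accumulated = 0.0
    skipped = 0  # non-selected frames since the last selected one

    for t in range(1, len(energies)):
        # A posteriori SNR in the log domain, floored at zero so that
        # noise-like frames contribute nothing to the distance.
        snr_post = max(0.0, math.log10(energies[t] / noise_energy))
        # Log-energy change between neighbouring frames, weighted by the SNR.
        distance = abs(math.log10(energies[t])
                       - math.log10(energies[t - 1])) * snr_post
        accumulated += distance

        if accumulated >= threshold:
            # Lengthen the frame by one shift per skipped frame, up to a cap.
            length = min(base_len + skipped * shift, max_len)
            selected.append((t, length))
            accumulated = 0.0
            skipped = 0
        else:
            skipped += 1

    return selected
```

With `select_frames([1, 1, 1, 100, 1000, 1000, 1000], 1, 1.0)` the sketch returns `[(0, 200), (3, 360), (4, 200)]`: the frame that follows two skipped steady, noise-level frames is lengthened, while the next frame, in a rapidly changing high-SNR stretch, is selected immediately at the normal length.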
Original language: English
Journal: Computers & Electrical Engineering
Volume: 40
Issue number: 7
Pages (from-to): 2139-2149
ISSN: 0045-7906
Status: Published - Oct. 2014
