Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features: A Theoretically Consistent Approach

Research output: Contribution to journal › Journal article › Research › peer-review

19 Citations (Scopus)

Abstract

In this work, we consider the problem of feature enhancement for noise-robust automatic speech recognition (ASR). We propose a method for minimum mean-square error (MMSE) estimation of mel-frequency cepstral features, which is based on a minimum number of well-established, theoretically consistent statistical assumptions. More specifically, the method belongs to the class of methods relying on the statistical framework proposed in Ephraim and Malah’s original work [1]. The method is general in that it allows MMSE estimation of mel-frequency cepstral coefficients (MFCCs), cepstral-mean subtracted (CMS-) MFCCs, autoregressive-moving-average (ARMA)-filtered CMS-MFCCs, velocity, and acceleration coefficients. In addition, the method is easily modified to take into account compressive non-linearities other than the logarithm traditionally used for MFCC computation. In terms of MFCC estimation performance, as measured by MFCC mean-square error, the proposed method performs as well as or better than other state-of-the-art methods. In terms of ASR performance, no statistical difference could be found between the proposed method and the state-of-the-art methods. We conclude that existing state-of-the-art MFCC feature enhancement algorithms within this class of algorithms, while theoretically suboptimal or based on theoretically inconsistent assumptions, perform close to optimally in the MMSE sense.
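
For readers unfamiliar with the feature types named in the abstract, the following is a minimal sketch of how they are commonly derived from mel-band energies: a compressive logarithm, a discrete cosine transform, and then linear post-processing. It illustrates the feature definitions only and is not the MMSE estimator proposed in the paper. The unnormalized DCT-II, the utterance-level mean subtraction, the two-point central difference, and all function names are assumptions chosen for illustration.

```python
import math


def dct_ii(values, num_ceps):
    """Unnormalized type-II DCT of a sequence, keeping the first num_ceps terms."""
    n = len(values)
    return [
        sum(x * math.cos(math.pi * k * (m + 0.5) / n) for m, x in enumerate(values))
        for k in range(num_ceps)
    ]


def mfcc(mel_energies, num_ceps=4):
    """Cepstral coefficients of one frame: log compression followed by a DCT."""
    return dct_ii([math.log(e) for e in mel_energies], num_ceps)


def cms(frames):
    """Cepstral mean subtraction: remove the per-coefficient mean over all frames."""
    count = len(frames)
    mean = [sum(column) / count for column in zip(*frames)]
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


def velocity(frames):
    """Two-point central difference over time, replicating the edge frames."""
    last = len(frames) - 1
    deltas = []
    for i in range(len(frames)):
        previous_frame = frames[max(i - 1, 0)]
        next_frame = frames[min(i + 1, last)]
        deltas.append([(b - a) / 2.0 for a, b in zip(previous_frame, next_frame)])
    return deltas
```

Under these assumptions, acceleration coefficients would be obtained by applying `velocity` twice, and a different compressive non-linearity could be tried by replacing `math.log` inside `mfcc`.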
Original language: English
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 23
Issue number: 1
Pages (from-to): 186-197
ISSN: 1558-7916
Publication status: Published - Jan 2015
