8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A Hidden Markov Model-Based Missing Data Imputation Approach

Yu Luo, Limin Du

Chinese Academy of Sciences, China

The accuracy of automatic speech recognizers degrades rapidly when speech is distorted by noise, and robustness against noise has become one of the challenging problems in the field. In this paper, a hidden Markov model (HMM) based data imputation approach is presented to improve the noise robustness of speech recognition at the front end of the recognizer. To exploit the correlation between different filter-bank channels, the approach performs missing data imputation with an HMM of L states, each of which has a Gaussian output distribution with a full covariance matrix. "Missing" data in speech filter-bank vector sequences are recovered by a MAP procedure from either the locally optimal state path or the marginal Viterbi-decoded HMM state sequence. The potential of the approach was tested using a speaker-independent continuous Mandarin speech recognizer with a syllable loop of perplexity 402, for both Gaussian and babble noise, each at 6 different SNR levels ranging from 0 dB to 25 dB, showing a significant improvement in robustness against additive noise.
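
As an illustration of the per-state recovery step only, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the HMM state for a frame has already been decoded, that a boolean mask marks the unreliable filter-bank channels, and that the state's Gaussian mean and full covariance are known. For a Gaussian, the MAP estimate of the missing channels given the observed ones is the conditional mean, which is where the full covariance (the cross-channel correlation) comes in. The function name `impute_map` and its parameters are hypothetical.

```python
import numpy as np

def impute_map(x, missing, mean, cov):
    """Fill the missing entries of one filter-bank vector x.

    Assumption: x is modeled by a single Gaussian N(mean, cov) belonging to
    an already-decoded HMM state. The MAP estimate of the missing part given
    the observed part is the Gaussian conditional mean:
        x_m = mean_m + cov_mo @ inv(cov_oo) @ (x_o - mean_o)
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    observed = ~missing

    if not missing.any():
        return x.copy()        # nothing to recover
    if not observed.any():
        return mean.copy()     # no evidence: fall back to the state mean

    cov_oo = cov[np.ix_(observed, observed)]  # observed/observed block
    cov_mo = cov[np.ix_(missing, observed)]   # missing/observed block
    residual = x[observed] - mean[observed]

    out = x.copy()
    # Solve the linear system rather than forming an explicit inverse.
    out[missing] = mean[missing] + cov_mo @ np.linalg.solve(cov_oo, residual)
    return out

# Two correlated channels; the second one is marked as missing.
state_mean = np.array([0.0, 0.0])
state_cov = np.array([[1.0, 0.5],
                      [0.5, 1.0]])
frame = np.array([2.0, np.nan])
recovered = impute_map(frame, [False, True], state_mean, state_cov)
```

With a diagonal covariance the cross term `cov_mo` would be zero and every missing channel would simply be replaced by the state mean, so the observed channels could not inform the estimate. How the state sequence is obtained from incomplete observations is outside the scope of this sketch.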


Bibliographic reference. Luo, Yu / Du, Limin (2003): "A hidden Markov model-based missing data imputation approach", in EUROSPEECH-2003, 1765-1768.