ISCA Archive ASR 2000

Hidden-articulator Markov models for speech recognition

Matt Richardson, Jeff Bilmes, Chris Diorio

In traditional speech recognition using Hidden Markov Models (HMMs), each state represents an acoustic portion of a phoneme. We explore the concept of an articulator-based HMM, in which each state represents a particular articulatory configuration [Erler 1996]. In this paper, we present a novel articulatory feature mapping and a new technique for model initialization. In addition, we use diphone modeling, which allows context-dependent training of transition probabilities. Our goal is to confirm that articulatory knowledge can assist speech recognition. We demonstrate this by showing that our mapping of articulatory configurations to phonemes performs better than random mappings. Furthermore, we demonstrate the practicality of the model by showing that combining it with a standard model reduces the word error rate by 12-22% relative to the standard model alone.
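
The relative word error rate (WER) figure quoted in the abstract is a ratio, not an absolute difference in percentage points. The following minimal Python sketch shows how such a relative reduction is conventionally computed. The function name `relative_wer_reduction` and the WER values in the usage lines are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, combined_wer: float) -> float:
    """Return the relative WER reduction of a combined system over a baseline.

    Both arguments are word error rates in the same unit (for example, percent).
    The result is a fraction: 0.20 means a 20% relative reduction.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - combined_wer) / baseline_wer


# Hypothetical numbers, not taken from the paper:
# a baseline at 5.0% WER and a combined system at 4.0% WER.
reduction = relative_wer_reduction(5.0, 4.0)
print(f"relative WER reduction: {reduction:.0%}")
```

With these illustrative inputs the absolute improvement is one percentage point, while the relative improvement is one fifth of the baseline error, which is the kind of quantity the 12-22% range describes.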

K. Erler and G. H. Freeman (1996). "An HMM-based speech recognizer using overlapping articulatory features," J. Acoust. Soc. Am. 100, pp. 2500-2513.

Cite as: Richardson, M., Bilmes, J., Diorio, C. (2000) Hidden-articulator Markov models for speech recognition. Proc. ASR2000 - Automatic Speech Recognition: Challenges for the New Millenium, 133-139

  author={Matt Richardson and Jeff Bilmes and Chris Diorio},
  title={{Hidden-articulator Markov models for speech recognition}},
  booktitle={Proc. ASR2000 - Automatic Speech Recognition: Challenges for the New Millenium},