ISCA Archive Interspeech 2006

Dynamic evidence models in a DBN phone recognizer

William Schuler, Tim Miller, Stephen Wu, Andrew Exley

This paper describes an implementation of a discriminative acoustical model, a Conditional Random Field (CRF), within a Dynamic Bayes Net (DBN) formulation of a Hierarchic Hidden Markov Model (HHMM) phone recognizer. This CRF-DBN topology accounts for phone transition dynamics in conditional probability distributions over random variables associated with observed evidence. It therefore has less need for hidden variable states corresponding to transitions between phones, leaving more hypothesis space available for modeling higher-level linguistic phenomena such as syntax and semantics. The model also has the interesting property that it explicitly represents likely formant trajectories and formant targets of modeled phones in its random variable distributions, making it more linguistically transparent than models based on traditional HMMs with conditionally independent evidence variables. Results on the standard TIMIT phone recognition task show that this CRF evidence model, even with a relatively simple first-order feature set, is competitive with standard HMMs and DBN variants using static Gaussian mixture models on MFCC features.

doi: 10.21437/Interspeech.2006-368

Cite as: Schuler, W., Miller, T., Wu, S., Exley, A. (2006) Dynamic evidence models in a DBN phone recognizer. Proc. Interspeech 2006, paper 1770-Tue3A1O.6, doi: 10.21437/Interspeech.2006-368

  author={William Schuler and Tim Miller and Stephen Wu and Andrew Exley},
  title={{Dynamic evidence models in a DBN phone recognizer}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1770-Tue3A1O.6},