ISCA Archive ICSLP 1994

Modeling dynamics in connectionist speech recognition - the time index model

Yochai Konig, Nelson Morgan

We are experimenting with an approach to connectionist speech recognition that models the dynamics within a speech segment by using temporal position as an explicit variable. The Hidden Markov Model (HMM) is currently the most common model of human speech production used in speech recognition. However, HMMs suffer from well-known limitations, most notably the assumption that the observations generated in a given state are independent and identically distributed (i.i.d.). As an alternative, we are developing a time index model that explicitly conditions the emission probability of a state on the time index, defined as the number of frames from entry into the state up to the current frame. The proposed model therefore does not require the i.i.d. assumption. Our pilot results suggest that the time-index approach can greatly reduce error if we have good information about the phoneme boundary location.
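
To make the definition of the time index concrete, the following is a minimal sketch of how it could be derived from a frame-level state alignment. It is an illustration only, not the authors' implementation: the function name `time_indices`, the representation of the alignment as a list of state labels, and the choice to give the entry frame index 0 are all assumptions that the abstract does not specify.

```python
from typing import List, Sequence


def time_indices(state_path: Sequence[str]) -> List[int]:
    """Return, for each frame, how many frames have passed since the
    current state was entered.

    Assumption: the frame on which a state is entered gets index 0.
    Assumption: a change of label marks entry into a new state, so two
    adjacent segments with the same label are treated as one segment.
    """
    indices: List[int] = []
    previous_state = None
    frames_in_state = 0
    for state in state_path:
        if state == previous_state:
            frames_in_state += 1   # still in the same state
        else:
            frames_in_state = 0    # a new state was entered on this frame
        indices.append(frames_in_state)
        previous_state = state
    return indices


# Hypothetical alignment: three frames of "s", two of "ih", one of "t".
alignment = ["s", "s", "s", "ih", "ih", "t"]
print(time_indices(alignment))
```

In a model of the kind described, the emission probability for each frame would be conditioned on both the state label and this index, which is why accurate segment boundaries matter: a misplaced boundary shifts every index in the segments on either side of it.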


Cite as: Konig, Y., Morgan, N. (1994) Modeling dynamics in connectionist speech recognition - the time index model. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1523-1526

@inproceedings{konig94_icslp,
  author={Yochai Konig and Nelson Morgan},
  title={{Modeling dynamics in connectionist speech recognition - the time index model}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1523--1526}
}