ISCA Archive Eurospeech 1999

Modeling the rate of speech by Markov processes on curves

Lawrence Saul, Mazin Rahim

We propose a statistical model for automatic speech recognition that relates variations in speaking rate to nonlinear warpings of time. The model describes a discrete random variable, s(t), that evolves as a function of the arc length traversed along a curve, parameterized by x(t). Since arc length does not depend on the rate at which a curve is traversed, this evolution gives rise to a family of Markov processes whose predictions, Pr[s|x], are invariant to nonlinear warpings of time. We describe the use of such models, known as Markov processes on curves (MPCs), for automatic speech recognition, where x are acoustic feature trajectories and s are phonetic transcriptions. On two tasks (recognizing New Jersey town names and connected alpha-digits) we find that MPCs yield lower word error rates than comparably trained hidden Markov models.
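
The invariance the abstract relies on, that arc length does not depend on how fast a curve is traversed, can be checked numerically. The sketch below is only an illustration under stated assumptions, not the MPC model itself: the quarter-circle `curve`, the cubic warp, and the sample count `n` are hypothetical choices, and the 2-D points merely stand in for acoustic feature vectors.

```python
import math

def arc_length(points):
    """Polyline approximation: sum of distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def curve(u):
    """Hypothetical toy trajectory: a quarter of the unit circle, u in [0, 1]."""
    angle = 0.5 * math.pi * u
    return (math.cos(angle), math.sin(angle))

n = 20000  # number of sampling intervals (assumed; controls approximation error)
uniform = [i / n for i in range(n + 1)]   # constant "speaking rate"
warped = [u ** 3 for u in uniform]        # nonlinear, monotonic time warp

# Same curve, traversed at two different rates.
length_uniform = arc_length([curve(u) for u in uniform])
length_warped = arc_length([curve(u) for u in warped])

print(length_uniform, length_warped, math.pi / 2)
```

Both sampled lengths should agree with each other and with the exact value π/2 up to discretization error, even though the warped traversal moves slowly at first and quickly at the end. Any quantity defined as a function of arc length therefore sees the same input under either traversal.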


doi: 10.21437/Eurospeech.1999-107

Cite as: Saul, L., Rahim, M. (1999) Modeling the rate of speech by Markov processes on curves. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 415-418, doi: 10.21437/Eurospeech.1999-107

@inproceedings{saul99_eurospeech,
  author={Lawrence Saul and Mazin Rahim},
  title={{Modeling the rate of speech by Markov processes on curves}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={415--418},
  doi={10.21437/Eurospeech.1999-107}
}