ISCA Archive Eurospeech 1999

Modelling speaking rate using a between frame distance metric

Andreas Tuerk, Steve Young

It is well known [5] that variations in speaking rate can account for a significant percentage of errors in practical speech recognition tasks. This is a result of the dynamic nature of speech, which is not modelled properly by the standard HMM structure. This paper proposes an extension to the standard HMM that takes advantage of the information about the rate of speech that is contained in inter-frame transitions. The new model can be seen as a combination of Moore-type and Mealy-type HMMs: it has output probabilities attached to the transitions between states in addition to the conventional output probabilities attached to states. In this model, fast and slow transitions are associated with additional hidden parameters. The output probabilities of the transitions are modelled with gamma distributions.
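As a rough illustration of the idea of scoring inter-frame transitions with a gamma distribution, the following minimal sketch computes a distance between consecutive feature frames and adds its gamma log-density to a transition log-probability. The abstract does not specify the distance metric, the parameter values, or how the scores are combined, so the Euclidean distance, the function names, and the `shape`/`scale` arguments here are all assumptions made for illustration only.

```python
import math

def frame_distances(frames):
    """Distance between each pair of consecutive feature vectors.

    Euclidean distance is an assumption; the paper's metric may differ.
    """
    return [math.dist(a, b) for a, b in zip(frames, frames[1:])]

def gamma_logpdf(x, shape, scale):
    """Log-density of a gamma distribution with the given shape and scale.

    Simplification: non-positive x is treated as having zero density.
    """
    if x <= 0:
        return float("-inf")
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def transition_log_score(log_trans_prob, distance, shape, scale):
    """Hypothetical combined score for one state transition:
    transition log-probability plus the gamma log-density of the
    between-frame distance observed on that transition."""
    return log_trans_prob + gamma_logpdf(distance, shape, scale)
```

Under these assumptions, a "fast" and a "slow" transition would simply use different (hypothetical) `shape` and `scale` values, so the same between-frame distance receives a different score depending on which hidden rate class is hypothesised.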


doi: 10.21437/Eurospeech.1999-108

Cite as: Tuerk, A., Young, S. (1999) Modelling speaking rate using a between frame distance metric. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 419-422, doi: 10.21437/Eurospeech.1999-108

@inproceedings{tuerk99_eurospeech,
  author={Andreas Tuerk and Steve Young},
  title={{Modelling speaking rate using a between frame distance metric}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={419--422},
  doi={10.21437/Eurospeech.1999-108}
}