ISCA Archive Interspeech 2009

Precision of phoneme boundaries derived using hidden Markov models

Ladan Baghai-Ravary, Greg Kochanski, John Coleman

Some phoneme boundaries correspond to abrupt changes in the acoustic signal. Others are less clear-cut because the transition from one phoneme to the next is gradual.

This paper compares the phoneme boundaries identified by a large number of different alignment systems, using different signal representations and hidden Markov model (HMM) structures. The variability of the different boundaries is analysed statistically, with the boundaries grouped in terms of the broad phonetic classes of the respective phonemes.

The mutual consistency between the boundaries from the various systems is analysed to identify which classes of phoneme boundary can be identified reliably by an automatic labelling system, and which are ill-defined and ambiguous.
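
As a rough illustration of this kind of consistency analysis, the sketch below groups boundaries by the broad phonetic classes on either side and summarises how much the systems disagree about each boundary's position. It is only a minimal sketch under stated assumptions: the abstract does not specify the statistic used, so the standard deviation across systems and the median per class pair are assumed here, and the function name, class labels and sample times are hypothetical values invented for illustration, not results from the paper.

```python
from collections import defaultdict
from statistics import median, pstdev


def spread_by_class_pair(boundaries):
    """Summarise inter-system disagreement per broad-class pair.

    `boundaries` is an iterable of (class_pair, times) tuples, where
    `class_pair` names the broad classes on either side of a boundary and
    `times` holds that boundary's position (in seconds) from each system.
    Returns {class_pair: median standard deviation across systems}.
    """
    per_class = defaultdict(list)
    for class_pair, times in boundaries:
        if len(times) < 2:
            # A spread needs at least two systems to compare.
            continue
        per_class[class_pair].append(pstdev(times))
    return {pair: median(spreads) for pair, spreads in per_class.items()}


# Hypothetical boundary times, invented for illustration only.
example = [
    (("stop", "vowel"), [0.100, 0.102, 0.101]),
    (("stop", "vowel"), [0.500, 0.498, 0.503]),
    (("vowel", "glide"), [0.300, 0.330, 0.280]),
]
result = spread_by_class_pair(example)
```

Under these assumptions, a small value for a class pair suggests that the systems agree closely on where such boundaries fall, while a large value suggests the boundary type is harder to place consistently.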

The results presented here provide a starting point for developing techniques that compare systems objectively without giving undue weight to variations in those phoneme boundaries which are inherently ambiguous. Such techniques should improve the efficiency with which new alignment and HMM training algorithms can be developed.

doi: 10.21437/Interspeech.2009-44

Cite as: Baghai-Ravary, L., Kochanski, G., Coleman, J. (2009) Precision of phoneme boundaries derived using hidden Markov models. Proc. Interspeech 2009, 2879-2882, doi: 10.21437/Interspeech.2009-44

  author={Ladan Baghai-Ravary and Greg Kochanski and John Coleman},
  title={{Precision of phoneme boundaries derived using hidden Markov models}},
  booktitle={Proc. Interspeech 2009},