5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Additional Use of Phoneme Duration Hypotheses in Automatic Speech Segmentation

Karlheinz Stöber, Wolfgang Hess

IKP, Bonn University, Germany

We describe a new approach to speaker-independent automatic phoneme alignment. Typical algorithms for this task rely only on phoneme-to-frame similarity measures, which are then maximised or minimised. In addition to such similarity measures, we use phoneme duration hypotheses generated by the speech synthesis system HADIFIX. Because these duration hypotheses are difficult to incorporate into algorithms based on dynamic programming, we construct a cost function that combines phoneme-to-frame and segment-to-duration-hypothesis similarity measures, and minimise it with a genetic algorithm. The results show that the accuracy of the automatically determined phoneme boundaries increases, especially for speakers not used in the training phase.
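The idea of a combined cost function minimised by a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the similarity measures, the weighting, or the genetic operators, so everything below is an assumption made for illustration: the names `segment_cost`, `repair` and `align`, the absolute-difference duration term, the `weight` parameter, truncation selection, one-point crossover and boundary-shift mutation. The sketch takes a precomputed matrix `frame_cost[p][t]` (dissimilarity of frame `t` to phoneme `p`) and one duration hypothesis per phoneme in frames, and assumes at least two phonemes.

```python
import random


def segment_cost(boundaries, frame_cost, dur_hyp, weight=1.0):
    """Combined cost of one candidate segmentation (lower is better)."""
    n_frames = len(frame_cost[0])
    edges = [0] + list(boundaries) + [n_frames]
    total = 0.0
    for p, (start, end) in enumerate(zip(edges, edges[1:])):
        # Phoneme-to-frame term: dissimilarity of each frame to phoneme p.
        total += sum(frame_cost[p][start:end])
        # Segment-to-duration term (assumed form): deviation of the
        # segment length from the hypothesised duration, in frames.
        total += weight * abs((end - start) - dur_hyp[p])
    return total


def repair(boundaries, n_frames):
    """Make boundaries strictly increasing and keep every segment non-empty."""
    fixed = []
    n = len(boundaries)
    for i, b in enumerate(sorted(boundaries)):
        low = fixed[-1] + 1 if fixed else 1
        high = n_frames - (n - i)  # leave room for the remaining boundaries
        fixed.append(min(max(b, low), high))
    return fixed


def align(frame_cost, dur_hyp, weight=1.0, pop_size=40, generations=200, seed=0):
    """Search for a low-cost segmentation with a simple genetic algorithm."""
    rng = random.Random(seed)
    n_frames = len(frame_cost[0])
    n_bound = len(frame_cost) - 1  # one boundary between adjacent phonemes

    def fitness(individual):
        return segment_cost(individual, frame_cost, dur_hyp, weight)

    # Initial population: random, repaired boundary vectors.
    population = [
        repair([rng.randint(1, n_frames - 1) for _ in range(n_bound)], n_frames)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(0, n_bound)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_bound)  # mutation: shift one boundary slightly
            child[i] += rng.choice([-2, -1, 1, 2])
            children.append(repair(child, n_frames))
        population = parents + children
    return min(population, key=fitness)
```

Because the duration term depends on whole-segment lengths and not on individual frames, it is simply added to the cost of a complete candidate segmentation here; a population-based search evaluates complete candidates directly, so any such global term can be included without restructuring the search.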


Bibliographic reference. Stöber, Karlheinz / Hess, Wolfgang (1998): "Additional use of phoneme duration hypotheses in automatic speech segmentation", in ICSLP-1998, paper 0239.