EUROSPEECH 2003 - INTERSPEECH 2003
This paper describes an effective method for automatic speech unit segmentation. Based on hidden Markov models (HMMs), an initial segmentation estimate derived from the explicit phonetic transcription is processed by our local HMM training algorithm. Using reliable silence boundaries obtained from a silence detector, the algorithm tries different training methods to overcome the problem of insufficient training data. Performance is tested on a Mandarin TTS speech corpus. The results show that this method achieves a 14.98% improvement in the boundary detection error rate (boundaries deviating by more than 20 ms).
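As an illustration of the evaluation criterion only, the following minimal sketch computes a boundary detection error rate with a 20 ms tolerance. It assumes that reference and predicted boundaries are already paired one-to-one and given in milliseconds; the function name, the pairing assumption, and the sample values are hypothetical and are not taken from the paper.

```python
def boundary_error_rate(reference_ms, predicted_ms, tolerance_ms=20.0):
    """Fraction of boundaries deviating from the reference by more than tolerance_ms.

    Assumption (not from the paper): both lists hold the same boundaries
    in the same order, with times in milliseconds.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("reference and predicted boundary lists must have equal length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")

    # A boundary counts as an error only if it deviates by strictly more
    # than the tolerance (20 ms by default).
    errors = sum(
        1
        for ref, pred in zip(reference_ms, predicted_ms)
        if abs(ref - pred) > tolerance_ms
    )
    return errors / len(reference_ms)


if __name__ == "__main__":
    # Hypothetical boundary times, for illustration only.
    reference = [120.0, 340.0, 515.0, 790.0]
    predicted = [125.0, 365.0, 512.0, 760.0]
    print(boundary_error_rate(reference, predicted))
```

In this made-up example, two of the four predicted boundaries deviate by more than 20 ms (by 25 ms and 30 ms), so the function returns 0.5.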
Bibliographic reference. Zheng, Hong / Lu, Yiqing (2003): "Using both global and local hidden Markov models for automatic speech unit segmentation", In EUROSPEECH-2003, 1537-1540.