8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Improved HMM/SVM Methods for Automatic Phoneme Segmentation

Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang

Academia Sinica, Taiwan

This paper presents improved HMM/SVM methods for a two-stage phoneme segmentation framework designed to imitate the human phoneme segmentation process. The first stage performs hidden Markov model (HMM) forced alignment according to the minimum boundary error (MBE) criterion. Its objective is to align the phoneme sequence of a speech utterance with the corresponding acoustic signal, using MBE-trained HMMs and explicit phoneme duration models. The second stage uses a support vector machine (SVM) to refine the hypothesized phoneme boundaries produced by the HMM-based forced alignment. The efficacy of the proposed framework has been validated on two speech databases: the TIMIT English database and the MATBN Mandarin Chinese database.
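
As a rough illustration of the second stage only, the following minimal Python sketch moves each hypothesized boundary to the best-scoring frame within a small search window. It is not the authors' implementation: the function name `refine_boundaries`, the `score` callable (standing in for a trained boundary classifier such as an SVM decision function), the frame-index representation, and the `window` size are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def refine_boundaries(
    boundaries: Sequence[int],
    score: Callable[[int], float],
    num_frames: int,
    window: int = 2,
) -> List[int]:
    """Move each hypothesized boundary to the best-scoring nearby frame.

    boundaries: frame indices from a first-pass alignment (assumed input).
    score:      hypothetical stand-in for a boundary classifier; a higher
                value means the frame looks more like a true boundary.
    window:     number of frames searched on each side (assumed value).
    """
    refined: List[int] = []
    for b in boundaries:
        # Clip the search window to the valid frame range.
        lo = max(0, b - window)
        hi = min(num_frames - 1, b + window)
        # Pick the highest score; on ties, prefer the frame closest
        # to the original hypothesis.
        best = max(range(lo, hi + 1), key=lambda t: (score(t), -abs(t - b)))
        refined.append(best)
    return refined


# Toy usage: a made-up scorer that favours frames 11 and 19.
toy_score = lambda t: 1.0 if t in (11, 19) else 0.0
print(refine_boundaries([10, 20], toy_score, num_frames=30))  # [11, 19]
```

This sketch treats each boundary independently and does not enforce that refined boundaries stay in order; a practical system would need to handle that, and the paper's actual refinement procedure may differ.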

Bibliographic reference. Kuo, Jen-Wei / Lo, Hung-Yi / Wang, Hsin-Min (2007): "Improved HMM/SVM methods for automatic phoneme segmentation", in INTERSPEECH-2007, 2057-2060.