10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

A One-Step Tone Recognition Approach Using MSD-HMM for Continuous Speech

Changliang Liu, Fengpei Ge, Fuping Pan, Bin Dong, Yonghong Yan

Chinese Academy of Sciences, China

There are two types of methods for tone recognition in continuous speech: one-step and two-step approaches. Two-step approaches must first identify the syllable boundaries, while one-step approaches do not. Previous studies have mostly focused on two-step approaches. In this paper, a one-step approach using the multi-space distribution HMM (MSD-HMM) is investigated. F0, which exists only in voiced speech, is modeled by the MSD-HMM. A tonal syllable network is then built based on the reference, and a Viterbi search is carried out on it to find the best tone sequence. Two modifications to the conventional tri-phone HMM models are investigated: tone-based context expansion and syllable-based model units. The experimental results show that tone-based context information is more important for tone recognition and that syllable-based HMM models perform much better than phone-based ones. The final tone correct rate is 88.8%, which is much higher than that of state-of-the-art two-step approaches.
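As an illustration of why a multi-space distribution suits F0, the sketch below computes the per-frame output log-probability of a single two-space MSD state: a zero-dimensional space for unvoiced frames (which contributes only its weight) and a one-dimensional Gaussian space for voiced frames. This is a minimal sketch of the general MSD formulation, not the configuration used in the paper. The function name, the use of `None` to mark unvoiced frames, the single Gaussian in the voiced space, and all numeric values are assumptions made for the example.

```python
import math


def msd_log_likelihood(f0, w_voiced, mean, var):
    """Log output probability of one frame under a two-space MSD state.

    f0       -- log-F0 value for a voiced frame, or None for an unvoiced frame
    w_voiced -- weight of the voiced (1-D Gaussian) space; the unvoiced
                (0-D) space gets the remaining weight 1 - w_voiced
    mean,var -- parameters of the Gaussian in the voiced space

    All names and the single-Gaussian voiced space are illustrative
    assumptions, not taken from the paper.
    """
    if not 0.0 < w_voiced < 1.0:
        raise ValueError("w_voiced must lie strictly between 0 and 1")
    if var <= 0.0:
        raise ValueError("var must be positive")

    if f0 is None:
        # Unvoiced frame: the 0-D space has no density, only its weight.
        return math.log(1.0 - w_voiced)

    # Voiced frame: space weight times the Gaussian density, in the log domain.
    log_gauss = -0.5 * (math.log(2.0 * math.pi * var) + (f0 - mean) ** 2 / var)
    return math.log(w_voiced) + log_gauss


# Hypothetical frame sequence: unvoiced, two voiced log-F0 values, unvoiced.
frames = [None, 5.1, 5.2, None]
total = sum(msd_log_likelihood(f, w_voiced=0.8, mean=5.0, var=0.04) for f in frames)
print(f"sequence log-likelihood for one state: {total:.3f}")
```

Because voiced and unvoiced frames are scored by the same state, no separate voicing decision or F0 interpolation is needed before decoding. In a full recognizer these per-frame scores would feed the Viterbi search over the tonal syllable network.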

Bibliographic reference. Liu, Changliang / Ge, Fengpei / Pan, Fuping / Dong, Bin / Yan, Yonghong (2009): "A one-step tone recognition approach using MSD-HMM for continuous speech", in INTERSPEECH-2009, 3015-3018.