EUROSPEECH 2003 - INTERSPEECH 2003
This paper presents a novel approach to tone recognition in continuous Cantonese speech based on overlapped di-tone Gaussian mixture models (ODGMM). The ODGMM is designed with special consideration of the fact that Cantonese tone identification relies more on the relative pitch level than on the pitch contour. A di-tone unit covers two consecutive tone occurrences, so the tone sequence carried by a Cantonese utterance can be viewed as a concatenation of such units, with adjacent di-tone units overlapping by exactly one tone. For each di-tone unit, a GMM is trained on a 10-dimensional feature vector that characterizes the F0 movement within the unit. In particular, the di-tone models capture the relative deviation between the F0 levels of the two tones. The Viterbi decoding algorithm is adopted to search for the optimal tone sequence under the phonological constraints on syllable-tone combinations. Experimental results show that the ODGMM approach significantly outperforms previously proposed methods for tone recognition in continuous Cantonese speech.
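The overlapping of di-tone units and the constrained search can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ditone_units` and `decode_tones`, the integer tone labels, and the score tables are hypothetical, and the per-unit GMM log-likelihoods are assumed to be precomputed and passed in as plain dictionaries rather than evaluated from F0 features.

```python
from typing import Dict, List, Sequence, Set, Tuple

DiTone = Tuple[int, int]  # (tone of first syllable, tone of second syllable)


def ditone_units(tones: Sequence[int]) -> List[DiTone]:
    """Split a tone sequence into di-tone units that overlap by one tone."""
    return list(zip(tones, tones[1:]))


def decode_tones(
    unit_scores: List[Dict[DiTone, float]],
    allowed: List[Set[int]],
) -> List[int]:
    """Viterbi search for the best tone sequence.

    unit_scores[i] maps a tone pair to an assumed log-likelihood for the
    di-tone unit spanning syllables i and i+1 (a stand-in for GMM scores).
    allowed[i] is the set of tones permitted for syllable i (a stand-in
    for the syllable-tone constraints).
    """
    if len(allowed) != len(unit_scores) + 1:
        raise ValueError("need one tone set per syllable")

    # best[t]: best cumulative score of a path ending in tone t
    best = {t: 0.0 for t in allowed[0]}
    backptrs: List[Dict[int, int]] = []

    for i, scores in enumerate(unit_scores):
        new_best: Dict[int, float] = {}
        ptr: Dict[int, int] = {}
        for b in allowed[i + 1]:
            candidates = [
                (best[a] + scores[(a, b)], a) for a in best if (a, b) in scores
            ]
            if candidates:
                new_best[b], ptr[b] = max(candidates)
        best = new_best
        backptrs.append(ptr)

    if not best:
        raise ValueError("no tone sequence satisfies the constraints")

    # Trace back from the best final tone.
    tone = max(best, key=best.get)
    path = [tone]
    for ptr in reversed(backptrs):
        tone = ptr[tone]
        path.append(tone)
    return path[::-1]
```

Because adjacent units share a tone, the decoder only needs to remember the best score for each possible tone of the current syllable; the shared tone is what links one unit's hypothesis to the next.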
Bibliographic reference. Qian, Yao / Lee, Tan / Li, Yujia (2003): "Overlapped di-tone modeling for tone recognition in continuous Cantonese speech", In EUROSPEECH-2003, 1845-1848.