Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Incorporating Tone Information into Cantonese Large-Vocabulary Continuous Speech Recognition

Wai Lau, Tan Lee, Yin Wing Wong, P. C. Ching

Department of Electronic Engineering, The Chinese University of Hong Kong

Tone recognition is indispensable in automatic speech recognition of tonal languages such as Chinese. Among the many Chinese dialects, Cantonese is well known for being rich in tones. This paper presents a comprehensive study on speaker-independent tone recognition in continuous Cantonese speech. Tone features are derived, on a short-time basis, from syllable-wide F0 and energy profiles. A novel technique of moving-window normalization is proposed to effectively reduce undesirable fluctuation of the feature parameters. This technique allows on-the-fly and adaptive estimation of the speaker's intrinsic pitch range. A conventional HMM-based approach is employed to recognize the normalized tone features. Using context-dependent tone models, an accuracy of 66.4% has been attained. The tone recognizer is then integrated into a Cantonese LVCSR system using the method of lattice expansion. Experimental results show that tone information, if reliably acquired and properly utilized, can contribute to a noticeable improvement in the overall performance of the LVCSR system.
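As a rough illustration of the moving-window idea mentioned in the abstract, the sketch below divides each voiced F0 value by the mean F0 of the most recent frames, so that the output reflects pitch relative to a running estimate of the speaker's range. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function name `moving_window_normalize`, the trailing (causal) window, the window length, the use of the mean as the range estimate, and the convention that unvoiced frames carry an F0 of 0.

```python
from statistics import mean


def moving_window_normalize(f0_hz, window=5):
    """Normalize an F0 track by a running mean (illustrative assumption only).

    f0_hz  : list of per-frame F0 values in Hz; 0 marks an unvoiced frame.
    window : number of most recent frames used to estimate the pitch level.
    """
    normalized = []
    for i, f0 in enumerate(f0_hz):
        # Trailing window ending at the current frame, so the estimate
        # can be updated on the fly as new frames arrive.
        start = max(0, i - window + 1)
        voiced = [v for v in f0_hz[start:i + 1] if v > 0]
        if f0 <= 0 or not voiced:
            # Unvoiced frame: no pitch to normalize.
            normalized.append(0.0)
        else:
            normalized.append(f0 / mean(voiced))
    return normalized


if __name__ == "__main__":
    track = [0.0, 110.0, 120.0, 130.0, 0.0, 125.0, 115.0]
    print(moving_window_normalize(track, window=3))
```

Because each value is divided by a local average, scaling the whole track by a constant (for example, a speaker with a higher overall pitch) leaves the normalized values unchanged, which is the kind of speaker independence the abstract is aiming for.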


Bibliographic reference. Lau, Wai / Lee, Tan / Wong, Yin Wing / Ching, P. C. (2000): "Incorporating tone information into Cantonese large-vocabulary continuous speech recognition", in ICSLP-2000, vol. 2, 883-886.