Sixth International Conference on Spoken Language Processing
October 16-20, 2000
Improved Tone Recognition by Normalizing for Coarticulation and Intonation Effects
Chao Wang, Stephanie Seneff
Spoken Language Systems Group, Laboratory for Computer Science,
Massachusetts Institute of Technology, Cambridge, MA, USA
We have previously demonstrated that tone modeling improved
speech recognition on a digit corpus [1]. In this work, we further
improve tone recognition by normalizing for both tone coarticulation
and intonation effects. Tone classification errors on
continuous digit strings were reduced by 26.1% from the baseline
when the effects of F0 downdrift, phrase boundaries, and tone
coarticulation were normalized. We also applied the same approach
to conversational speech from the YINHE domain [2] and
obtained similar improvements. The word error rate on spontaneous
YINHE data was reduced by 16.5% when a simple four-tone
model was applied to re-sort the recognizer's 10-best outputs.
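
The re-sorting of N-best outputs mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say how tone scores are combined with recognizer scores, so the linear combination, the `tone_weight` parameter, the `Hypothesis` type, and all scores below are illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    words: str               # recognized word string
    recognizer_score: float  # log score from the first-pass recognizer (assumed)
    tone_score: float        # log score of the hypothesis under a tone model (assumed)


def resort_nbest(nbest: List[Hypothesis], tone_weight: float = 1.0) -> List[Hypothesis]:
    """Re-sort an N-best list by recognizer score plus weighted tone score.

    The additive log-score combination is an assumption made for illustration.
    """
    return sorted(
        nbest,
        key=lambda h: h.recognizer_score + tone_weight * h.tone_score,
        reverse=True,  # higher combined log score ranks first
    )


# Made-up scores for two competing hypotheses (hypothetical data).
nbest = [
    Hypothesis("hypothesis A", recognizer_score=-10.0, tone_score=-5.0),
    Hypothesis("hypothesis B", recognizer_score=-10.5, tone_score=-2.0),
]

best = resort_nbest(nbest, tone_weight=1.0)[0]
print(best.words)  # prints "hypothesis B"
```

With `tone_weight=0.0` the original recognizer ranking is kept and "hypothesis A" stays on top; in this toy setup the tone score only changes the outcome when it outweighs the gap between recognizer scores.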
[1] C. Wang and S. Seneff, "A study of tones and tempo in continuous
Mandarin digit strings and their application in telephone quality
speech recognition," in Proc. ICSLP’98, Sydney, Australia, pp. 635-638, 1998.
[2] C. Wang, J. Glass, H. Meng, J. Polifroni, S. Seneff and V. Zue,
"Yinhe: A Mandarin Chinese version of the Galaxy system," in Proc.
Eurospeech’97, Rhodes, Greece, pp. 351-354, 1997.
Wang, Chao / Seneff, Stephanie (2000):
"Improved tone recognition by normalizing for coarticulation and intonation effects",
In ICSLP-2000, vol. 2, 83-86.