15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Acoustic Features for Robust Classification of Mandarin Tones

Hongbing Hu, Stephen A. Zahorian, Peter Guzewich, Jiang Wu

Binghamton University, USA

For applications such as tone modeling and automatic tone recognition, smoothed, all-voiced F0 (pitch) tracks are desirable. Three pitch trackers that have been shown to track pitch accurately are YAAPT, YIN, and PRAAT. In tests on English and Japanese databases, for which ground-truth pitch tracks are available by other means, we show that YAAPT has lower errors than YIN and PRAAT. We also experimentally compare the effectiveness of the three trackers for automatic classification of Mandarin tones. In addition to F0 tracks, a compact set of low-frequency spectral shape trajectories is used as additional features for automatic tone classification. A combination of pitch trajectories computed with YAAPT and spectral shape trajectories extracted from 800 ms intervals for each tone yields a tone classification accuracy of nearly 77%, which is higher than the rate human listeners achieve for isolated tonal syllables and higher than that obtained with the other two trackers.
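The abstract does not say which error measure was used to compare the trackers against the ground-truth tracks. As an illustration only, the sketch below computes one commonly used measure, a gross error rate: the fraction of reference-voiced frames where the estimated F0 deviates from the reference by more than a relative tolerance. The function name, the 20% tolerance, the convention that 0 marks an unvoiced frame, and the sample values are all assumptions made for this example, not details taken from the paper.

```python
def gross_error_rate(reference_f0, estimated_f0, tolerance=0.2):
    """Fraction of reference-voiced frames with a large F0 deviation.

    Both tracks are per-frame F0 values in Hz; 0 marks an unvoiced frame
    (an assumed convention). A frame counts as an error when the estimate
    differs from the reference by more than `tolerance` times the reference.
    """
    if len(reference_f0) != len(estimated_f0):
        raise ValueError("tracks must have the same number of frames")

    # Only frames that the reference marks as voiced are scored.
    voiced = [(r, e) for r, e in zip(reference_f0, estimated_f0) if r > 0]
    if not voiced:
        raise ValueError("reference track has no voiced frames")

    errors = sum(1 for r, e in voiced if abs(e - r) > tolerance * r)
    return errors / len(voiced)


# Made-up tracks: one octave (halving) error and one missed voiced frame.
reference = [0.0, 200.0, 210.0, 220.0, 0.0, 180.0]
estimate = [0.0, 202.0, 105.0, 221.0, 0.0, 0.0]
rate = gross_error_rate(reference, estimate)
print(f"gross error rate: {rate:.2f}")
```

Under this definition, a tracker that outputs 0 for a frame the reference marks as voiced is penalized, which fits the stated preference for all-voiced tracks.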


Bibliographic reference. Hu, Hongbing / Zahorian, Stephen A. / Guzewich, Peter / Wu, Jiang (2014): "Acoustic features for robust classification of Mandarin tones", in INTERSPEECH-2014, 1352-1356.