This paper describes our efforts in building a competitive Mandarin broadcast news speech recognizer. We successfully incorporated the most popular speech technologies into our system. More importantly, we present two novel algorithms, one for smoothing pitch features and one for segmenting Chinese characters into word units. Additionally, we propose borrowing the principle of pointwise mutual information to create a Chinese word lexicon automatically. Our final system achieved a 6.0% character error rate (CER) on dev04 and 16.0% on eval04. It was competitive with other state-of-the-art systems while using simpler acoustic models, less training data, and a simpler decoding architecture.
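As a rough illustration of the pointwise mutual information principle mentioned in the abstract, the sketch below scores adjacent character pairs in a toy corpus with PMI(x, y) = log [p(x, y) / (p(x) p(y))]. It is not the paper's lexicon-building procedure: the helper name `bigram_pmi`, the toy corpus, and the unsmoothed relative-frequency estimates are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_pmi(sentences):
    """Return {(x, y): PMI} for adjacent character pairs.

    Hypothetical helper: probabilities are plain relative frequencies,
    with no smoothing or count thresholds.
    """
    unigrams = Counter()
    bigrams = Counter()
    for s in sentences:
        unigrams.update(s)             # count single characters
        bigrams.update(zip(s, s[1:]))  # count adjacent character pairs
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log(p_xy / (p_x * p_y))
    return scores


# Toy corpus (assumed for illustration, not taken from the paper).
corpus = ["北京新闻", "北京广播", "新闻广播"]
scores = bigram_pmi(corpus)

# List pairs from highest to lowest PMI.
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print("".join(pair), round(value, 3))
```

Pairs that co-occur more often than chance would predict receive higher scores, so under these assumptions they are the more plausible candidates for word-level lexicon entries.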
Cite as: Hwang, M.-Y., Lei, X., Wang, W., Shinozaki, T. (2006) Investigation on Mandarin broadcast news speech recognition. Proc. Interspeech 2006, paper 1916-Tue3A2O.3, doi: 10.21437/Interspeech.2006-371
@inproceedings{hwang06_interspeech,
  author={Mei-Yuh Hwang and Xin Lei and Wen Wang and Takahiro Shinozaki},
  title={{Investigation on Mandarin broadcast news speech recognition}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1916-Tue3A2O.3},
  doi={10.21437/Interspeech.2006-371}
}