Sixth International Conference on Spoken Language Processing
NLPR has long worked on Mandarin speech recognition. This paper reports our recent progress in this field, which has several significant novel characteristics: 1) very large speech databases are used to train a more robust acoustic model; 2) the acoustic model has evolved from non-tonal class-triphones to tonal class-triphones based on a tone-embedded decision tree, namely unified tone & triphone modeling. Experimental results on large test databases show that 1) hybrid databases help improve performance; 2) tone information is very useful and can contribute a 20% reduction in character errors on the high-quality "863" database; and 3) a one-pass decoder is a more efficient framework than a multi-pass decoder, especially when the LM and AM are accurate.
Bibliographic reference. Gao, Sheng / Xu, Bo / Zhang, Hong / Zhao, Bing / Li, Chengrong / Huang, Taiyi (2000): "Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR", In ICSLP-2000, vol.3, 798-801.