NLPR has long worked on Mandarin speech recognition. This paper reports our recent progress in this field, with several significant novel characteristics: 1) very large speech databases are used to learn a more robust acoustic model; 2) the acoustic model has evolved from non-tonal class-triphones to tonal class-triphones based on a tone-embedded decision tree, namely unified tone & triphone modeling. The experimental results on large test databases show that 1) hybrid databases are helpful for performance improvement; 2) tone information is very useful and can contribute a 20% character error reduction on the high-quality "863" database; and 3) a one-pass decoder is a more efficient framework than a multi-pass decoder, especially when the language model (LM) and acoustic model (AM) are accurate.
Cite as: Gao, S., Xu, B., Zhang, H., Zhao, B., Li, C., Huang, T. (2000) Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 798-801, doi: 10.21437/ICSLP.2000-655
@inproceedings{gao00d_icslp,
  author={Sheng Gao and Bo Xu and Hong Zhang and Bing Zhao and Chengrong Li and Taiyi Huang},
  title={{Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={798--801},
  doi={10.21437/ICSLP.2000-655}
}