We describe our continuing efforts to improve the UW-SRI-ICSI Mandarin broadcast speech recognizer. These efforts include increasing the acoustic and text training data, adding discriminative features, incorporating a frame-level discriminative training criterion, multiple-pass acoustic model (AM) cross adaptation, language model (LM) genre adaptation, and system combination. Without LM adaptation, the net effect was a 24%-64% relative reduction in character error rate (CER) across a variety of test sets. LM adaptation gave a further 6% relative CER reduction on broadcast conversations.
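For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate and a relative reduction are conventionally computed. It assumes the standard definition of CER as character-level edit distance divided by reference length. The function names (`edit_distance`, `cer`, `relative_reduction`) and the numbers in the usage note are illustrative assumptions, not taken from the paper or its scoring tools.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # char of ref missing from hyp
            insertion = curr[j - 1] + 1  # extra char in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline
```

As a purely hypothetical illustration, a system that moves from 20% CER to 15% CER achieves a relative reduction of `relative_reduction(0.20, 0.15)`, that is, about 25%, even though the absolute reduction is only 5 points.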
Bibliographic reference. Hwang, Mei-Yuh / Wang, Wen / Lei, Xin / Zheng, Jing / Cetin, Ozgur / Peng, Gang (2007): "Advances in Mandarin broadcast speech recognition", In INTERSPEECH-2007, 2613-2616.