ISCA Archive Interspeech 2008

Soft margin estimation with various separation levels for LVCSR

Jinyu Li, Zhi-Jie Yan, Chin-Hui Lee, Ren-Hua Wang

We extend our previous work on soft margin estimation (SME) to large vocabulary continuous speech recognition (LVCSR) in two new aspects. The first is to formulate SME with different units of separation: SME methods focusing on string-, word-, and phone-level separation are defined. The second is to compare SME with the popular conventional discriminative training (DT) methods, including maximum mutual information estimation (MMIE), minimum classification error (MCE), and minimum word/phone error (MWE/MPE). Tested on the 5k-word Wall Street Journal task, all the SME methods achieve relative word error rate (WER) reductions ranging from 17% to 25% over our baseline. Among them, phone-level SME obtains the best performance: it is slightly better than MPE and much better than the other conventional DT methods. This comprehensive comparison with conventional DT methods demonstrates the success of SME on LVCSR tasks.

doi: 10.21437/Interspeech.2008-100

Cite as: Li, J., Yan, Z.-J., Lee, C.-H., Wang, R.-H. (2008) Soft margin estimation with various separation levels for LVCSR. Proc. Interspeech 2008, 269-272, doi: 10.21437/Interspeech.2008-100

  author={Jinyu Li and Zhi-Jie Yan and Chin-Hui Lee and Ren-Hua Wang},
  title={{Soft margin estimation with various separation levels for LVCSR}},
  booktitle={Proc. Interspeech 2008},