ISCA Archive ISCSLP 2008

Evaluation and Analysis of Minimum Phone Error Training and Its Modified Versions for Large Vocabulary Mandarin Speech Recognition

Yung-Jen Cheng, Che-Kuang Lin, Lin-Shan Lee

This paper reports a detailed study of Minimum Phone Error (MPE), Minimum Phone Frame Error (MPFE), and a physical-state-level version of Minimum Bayes Risk (sMBR) training, as well as several modified versions of them, for the transcription of large vocabulary Mandarin broadcast news. We found that the results differ considerably from those observed previously for English and Arabic broadcast news tasks [1]; in particular, the trends differ depending on which performance measure (word or character accuracy) is used. This matters for Chinese, for which character accuracy is usually the more important measure, whereas word accuracy is commonly used for other languages. The modifications tested here include accounting for variable phone length and applying penalties to erroneous frames. In our experiments, they significantly improved character accuracy.

Index Terms: Discriminative training, Minimum Phone Error, Minimum Phone Frame Error, Minimum Bayes Risk


Cite as: Cheng, Y.-J., Lin, C.-K., Lee, L.-S. (2008) Evaluation and Analysis of Minimum Phone Error Training and Its Modified Versions for Large Vocabulary Mandarin Speech Recognition. Proc. International Symposium on Chinese Spoken Language Processing, 157-160

@inproceedings{cheng08_iscslp,
  author={Yung-Jen Cheng and Che-Kuang Lin and Lin-Shan Lee},
  title={{Evaluation and Analysis of Minimum Phone Error Training and Its Modified Versions for Large Vocabulary Mandarin Speech Recognition}},
  year=2008,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={157--160}
}