ISCA Archive Interspeech 2013

Language modeling for mixed language speech recognition using weighted phrase extraction

Ying Li, Pascale Fung

To train a code switching language model for mixed language speech recognition, we propose assigning weights to the sentence pairs in the parallel text data. The code switching language model, which is composed of a code switching boundary prediction model, a code switching translation model, and a reconstruction model, is incorporated with a language for mixed language speech recognition. The code switching translation model, which is trained on selected subsets of the sentence pairs in the parallel text data, allows the decoder to decide whether a phrase is in the matrix language or in the embedded language. Moreover, we propose a weighting procedure for training the code switching translation model. We evaluate our methods on Mandarin-English code switching lecture speech and lunch conversations. Our proposed method reduces word error rate by a statistically significant 1.74% on the lecture speech, and by 1.29% on the lunch conversations, over the conventional interpolated language model.
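As a rough illustration of what weighting sentence pairs during phrase extraction could look like, the sketch below estimates phrase translation probabilities by weighted relative frequency. It is not the authors' procedure: the abstract does not say how the weights are computed or how phrases are extracted, so the function name `weighted_phrase_table`, the input format, and the example weights are all hypothetical.

```python
from collections import defaultdict


def weighted_phrase_table(weighted_pairs):
    """Estimate P(target | source) by weighted relative frequency.

    weighted_pairs: iterable of (source_phrase, target_phrase, weight),
    where weight is the (assumed) weight of the sentence pair the
    phrase pair was extracted from.
    """
    joint = defaultdict(float)     # weighted count of (source, target)
    marginal = defaultdict(float)  # weighted count of source
    for src, tgt, weight in weighted_pairs:
        joint[(src, tgt)] += weight
        marginal[src] += weight
    # Normalise each joint count by the total weight of its source phrase.
    return {
        (src, tgt): count / marginal[src]
        for (src, tgt), count in joint.items()
        if marginal[src] > 0
    }


# Hypothetical phrase pairs: the Mandarin phrase either switches to
# English ("computer") or stays in Mandarin.
pairs = [
    ("电脑", "computer", 2.0),  # from a highly weighted sentence pair
    ("电脑", "电脑", 1.0),       # from a lower weighted sentence pair
]
table = weighted_phrase_table(pairs)
```

With uniform weights this reduces to ordinary relative-frequency estimation; raising the weight of a sentence pair shifts probability mass toward the phrase pairs extracted from it.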


doi: 10.21437/Interspeech.2013-584

Cite as: Li, Y., Fung, P. (2013) Language modeling for mixed language speech recognition using weighted phrase extraction. Proc. Interspeech 2013, 2599-2603, doi: 10.21437/Interspeech.2013-584

@inproceedings{li13g_interspeech,
  author={Ying Li and Pascale Fung},
  title={{Language modeling for mixed language speech recognition using weighted phrase extraction}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2599--2603},
  doi={10.21437/Interspeech.2013-584}
}