ISCA Archive Interspeech 2009

Effects of language mixing for automatic recognition of Cantonese-English code-mixing utterances

Houwei Cao, P. C. Ching, Tan Lee

While automatic speech recognition of either Cantonese or English alone has achieved a great degree of success, recognition of Cantonese-English code-mixing speech is far less straightforward. This paper analyzes the effect of language mixing on the recognition performance of code-mixing utterances. By examining the recognition results of Cantonese-English code-mixing speech, where Cantonese is the matrix language and English is the embedded language, we found that recognition accuracy of the embedded language plays a significant role in the overall performance. In particular, significant performance degradation is found in the matrix language if the embedded words cannot be recognized correctly. We also studied the error propagation effect of the embedded English. The results show that errors in embedded English words may propagate to two neighboring Cantonese syllables. Finally, analysis is carried out to identify the factors that influence recognition performance on embedded English.


doi: 10.21437/Interspeech.2009-762

Cite as: Cao, H., Ching, P.C., Lee, T. (2009) Effects of language mixing for automatic recognition of Cantonese-English code-mixing utterances. Proc. Interspeech 2009, 3011-3014, doi: 10.21437/Interspeech.2009-762

@inproceedings{cao09_interspeech,
  author={Houwei Cao and P. C. Ching and Tan Lee},
  title={{Effects of language mixing for automatic recognition of Cantonese-English code-mixing utterances}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={3011--3014},
  doi={10.21437/Interspeech.2009-762}
}