Code-switching speech is an utterance containing two or more languages. The switch usually occurs at the clause or word level. In this paper, we propose a two-stage framework, consisting of a language identifier followed by a speech recognizer, and evaluate it on Mandarin-Taiwanese code-switching utterances. The language identifier uses multiple cues, including acoustic, prosodic and phonetic features. To integrate these cues and distinguish one language from another, we use a maximum a posteriori decision rule that connects an acoustic model, a duration model and a language model. In the experiments, we achieve error rate reductions of 34.5% for language identification (LID) and 17.7% for speech recognition (ASR) compared with a one-stage LVCSR-based system.
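
As a rough illustration of how a maximum a posteriori decision might combine the three models, the sketch below sums per-language log-scores for one speech segment and picks the language with the highest total. This is not the paper's formulation, which the abstract does not give. The function name map_language, the score keys, the numeric values, and the assumption that the three scores can simply be added (conditional independence, equal weights) are all hypothetical.

```python
import math


def map_language(scores):
    """Return (language, combined log-score) with the highest total.

    scores maps each candidate language to a dict of log-probabilities
    under hypothetical keys 'acoustic', 'duration' and 'language_model'.
    Adding the logs assumes the three cues are conditionally independent
    and equally weighted, which is an illustrative simplification.
    """
    best_lang, best_score = None, -math.inf
    for lang, s in scores.items():
        total = s["acoustic"] + s["duration"] + s["language_model"]
        if total > best_score:
            best_lang, best_score = lang, total
    return best_lang, best_score


# Made-up log-scores for a single segment, for illustration only.
segment_scores = {
    "mandarin": {"acoustic": -42.0, "duration": -3.1, "language_model": -7.5},
    "taiwanese": {"acoustic": -40.5, "duration": -4.0, "language_model": -9.0},
}

decision, score = map_language(segment_scores)
```

In this made-up case the acoustic score alone favors Taiwanese, but the duration and language model scores tip the combined decision to Mandarin, which is the kind of interaction that integrating multiple cues is meant to capture.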
Bibliographic reference. Lyu, Dau-Cheng / Lyu, Ren-Yuan (2008): "Language identification on code-switching utterances using multiple cues", In INTERSPEECH-2008, 711-714.