ISCA Archive Interspeech 2008

An investigation of acoustic models for multilingual code-switching

Christopher M. White, Sanjeev Khudanpur, James K. Baker

Multilingual speech processing continues to develop as speech technology spreads to heterogeneous clients and applications. We address the distinct problem of code-switching: the spontaneous but occasional use, within speech in one language (referred to as L1), of words, phrases, expressions or idioms from a second language (L2). We examine two alternatives for modeling the acoustics of such words: creating L1 pronunciations for the out-of-language (OOL) words for use with L1 acoustic models, and retaining their L2 pronunciations for use with multilingual acoustic models. We test the hypothesis that the latter is a better acoustic model for OOL words. We develop a set of lexica in IPA form and a global phoneme inventory, and we handle the problem of L2 word pronunciation by creating linguistically motivated pairwise mappings. We show that retaining L2 pronunciations with multilingual acoustic models better explains the observations when restricted to a forced alignment.
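
The first alternative above, deriving an L1 pronunciation for an OOL word through a pairwise phoneme mapping, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `map_pronunciation`, the mapping table, and the phoneme inventory are hypothetical, and the specific phoneme pairs are illustrative assumptions, not taken from the paper.

```python
# Hypothetical pairwise mapping from L2 phonemes (IPA) to their closest
# L1 phonemes. The entries are illustrative assumptions only.
L2_TO_L1 = {
    "ç": "ʃ",
    "x": "k",
    "ʏ": "ʊ",
    "ʁ": "ɹ",
}

# Hypothetical (partial) L1 phoneme inventory, also in IPA.
L1_INVENTORY = {"b", "k", "uː", "ɪ", "ʃ", "ʊ", "ɹ"}


def map_pronunciation(l2_phones, mapping, l1_inventory):
    """Return an L1 pronunciation for a sequence of L2 phonemes.

    Phonemes shared by both languages are kept unchanged; all others
    are replaced through the pairwise mapping.
    """
    l1_phones = []
    for phone in l2_phones:
        if phone in l1_inventory:
            l1_phones.append(phone)           # shared phoneme, keep as-is
        elif phone in mapping:
            l1_phones.append(mapping[phone])  # substitute the mapped L1 phoneme
        else:
            raise KeyError(f"no L1 mapping for phoneme {phone!r}")
    return l1_phones


# An OOL word with L2 pronunciation /b uː x/ gets an L1 pronunciation.
ool_l1 = map_pronunciation(["b", "uː", "x"], L2_TO_L1, L1_INVENTORY)
```

Under the second alternative, the L2 phoneme sequence would be left unchanged and scored with multilingual acoustic models defined over the global phoneme inventory.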

doi: 10.21437/Interspeech.2008-667

Cite as: White, C.M., Khudanpur, S., Baker, J.K. (2008) An investigation of acoustic models for multilingual code-switching. Proc. Interspeech 2008, 2691-2694, doi: 10.21437/Interspeech.2008-667

  author={Christopher M. White and Sanjeev Khudanpur and James K. Baker},
  title={{An investigation of acoustic models for multilingual code-switching}},
  booktitle={Proc. Interspeech 2008},