Interspeech'2005 - Eurospeech
Multilingual ASR most commonly aims to cover many speakers of many languages. The problem we address in this article is more specific: an asymmetrical bilingual scenario, in which a single speaker may insert foreign words, pronounced as in the foreign language, into his or her speech. This is a frequent situation for French as spoken in Canada, where English proper names are often given their English pronunciations. We explore a new way of using multilingual models, in which a monolingual system is enhanced in a measured manner (Flavoured Acoustic Models). We also present an innovative bilingual spelling-to-sound system based on separate decision trees, which provides balanced pronunciation alternatives in both languages. Our ASR results over the telephony channel show that the two technologies combined outperform monolingual systems by up to 75% on English pronunciations, without degrading the word error rate on French pronunciations.
Bibliographic reference. Lejeune, R. / Baude, J. / Tchong, C. / Crepy, H. / Waast-Richard, C. (2005): "Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment", In INTERSPEECH-2005, 3325-3328.