Online generation of acoustic models for multilingual speech recognition

Martin Raab, Guillermo Aradilla, Rainer Gruhn, Elmar Nöth

Our goal is to provide a multilingual, speech-based human-machine interface for in-car infotainment and navigation systems. Multilinguality is needed, for example, to control a music player by speech, since artist and song names in the globalized music market come from many languages. Another frequent use case is entering foreign navigation destinations by speech. In this paper, we propose approximated projections between mixtures of Gaussians that allow a multilingual system to be generated from monolingual systems. This makes it possible to create the multilingual system on an embedded system, with the benefit that training and maintenance effort remain unchanged compared to providing monolingual systems. We also sketch how this algorithm, together with our previous work, can yield an efficient architecture for multilingual speech recognition on embedded devices.

doi: 10.21437/Interspeech.2009-759

Cite as: Raab, M., Aradilla, G., Gruhn, R., Nöth, E. (2009) Online generation of acoustic models for multilingual speech recognition. Proc. Interspeech 2009, 2999-3002, doi: 10.21437/Interspeech.2009-759

