ITRW on
Adaptation Methods for Speech Recognition

August 29-30, 2001
Sophia Antipolis, France

Pronunciation and Acoustic Model Adaptation for Improving Multilingual Speech Recognition

Jilei Tian, Imre Kiss and Olli Viikki

Speech and Audio Systems Laboratory, Nokia Research Center, Tampere, Finland

In this paper, we address the importance of pronunciation and acoustic model adaptation in multilingual speech recognition. When several languages are modeled simultaneously, the degree of speaker and language variability is even greater than when concentrating on a single language. To compensate for pronunciation variability across speakers, bi-lingual pronunciation modeling is proposed. Once the appropriate pronunciation has been found, the unused transcription is removed to prevent the vocabulary from growing. To further compensate for the mismatch between the multilingual acoustic models and the speaker's pronunciation, MAP on-line acoustic model adaptation is applied. Experimental results with 12 languages indicate the effectiveness of using these techniques jointly. Compared with the non-adapted multilingual system, bi-lingual pronunciation modeling combined with on-line acoustic model adaptation produced an average cross-language error rate reduction of 67.6% in clean and 55.4% in noisy operating conditions.

Bibliographic reference. Tian, Jilei / Kiss, Imre / Viikki, Olli (2001): "Pronunciation and acoustic model adaptation for improving multilingual speech recognition", in Adaptation-2001, 131-134.