INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Comparing Different Acoustic Modeling Techniques for Multilingual Boosting

David Imseng (1,2), John Dines (1), Petr Motlicek (1), Philip N. Garner (1), Hervé Bourlard (1,2)

(1) Idiap Research Institute, Martigny, Switzerland
(2) Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland

In this paper, we explore how different acoustic modeling techniques can benefit from data in languages other than the target language. We propose an algorithm to perform decision tree state clustering for the recently proposed Kullback-Leibler divergence-based hidden Markov model (KL-HMM) and compare it to subspace Gaussian mixture modeling (SGMM). KL-HMM can exploit multilingual information in the form of universal phoneme posterior features, whereas SGMM benefits from a universal background model that can be trained on multilingual data. Taking the Greek SpeechDat(II) data as an example, we show that KL-HMM performs best for small amounts of target language data.
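
As a rough illustration of the idea behind KL-HMM, the sketch below scores a phoneme posterior feature vector against per-state categorical distributions using the Kullback-Leibler divergence. It is a minimal sketch under stated assumptions, not the algorithm proposed in the paper: the function names, the three-class toy distributions, and the direction of the divergence (state distribution first) are all illustrative choices, and the decision tree state clustering is not shown.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two categorical distributions of equal length.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q has zero entries.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0  # terms with p_i = 0 contribute nothing
    )


def best_state(posterior, states):
    """Return the name of the state whose distribution is closest to posterior.

    states maps a hypothetical state name to its categorical distribution
    over phoneme classes; posterior is one frame of phoneme posteriors.
    """
    return min(states, key=lambda name: kl_divergence(states[name], posterior))


# Hypothetical toy example with three phoneme classes.
states = {
    "a": [0.8, 0.1, 0.1],
    "e": [0.1, 0.8, 0.1],
}
frame = [0.7, 0.2, 0.1]  # one frame of phoneme posterior features

print(best_state(frame, states))
```

In a full recognizer this per-frame divergence would serve as the local score inside Viterbi decoding; here only the single-frame comparison is shown.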

Index Terms: Speech recognition, multilingual acoustic modeling, under-resourced languages

Bibliographic reference. Imseng, David / Dines, John / Motlicek, Petr / Garner, Philip N. / Bourlard, Hervé (2012): "Comparing different acoustic modeling techniques for multilingual boosting", in INTERSPEECH-2012, 1191-1194.