This paper presents three methods for developing multilingual phone models for flexible speech recognition tasks. The main goal of our investigation is to find multilingual speech units that work equally well across many languages. With such a universal set, it is possible to build speech recognition systems for a variety of languages. One advantage of this approach is that acoustic-phonetic parameters can be shared in an HMM-based speech recognition system. The multilingual approach starts with the phone sets of six languages, which yield 232 language-dependent, context-independent phone models. We then developed three methods to map the language-dependent models to a multilingual phone set. The first method is a direct mapping to the phone set of the International Phonetic Association (IPA). The second method applies an automatic clustering algorithm to the phone models. The third method exploits the similarities of single mixture components of the language-dependent models. As in the first method, the language-specific models are first mapped to the IPA inventory. In a second step, agglomerative clustering is performed at the density level to find regions of similarity between the phone models of different languages. Experiments carried out with the SpeechDat(M) database show that the third method yields almost the same recognition rate as the language-dependent models. At the same time, this method gives a large reduction in the number of densities in the multilingual system.
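The density-level clustering step of the third method can be illustrated with a minimal sketch. The abstract does not state the distance measure, the stopping criterion, or the data layout, so everything below is an assumption made for illustration: densities are represented only by their mean vectors, the distance is squared Euclidean distance between cluster centroids, merging stops at a hypothetical `threshold`, and the labels of the form `(language, phone, component)` and the toy values are invented.

```python
from itertools import combinations


def distance(a, b):
    """Squared Euclidean distance between two mean vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members, means):
    """Unweighted average of the mean vectors of all densities in a cluster."""
    dim = len(means[members[0]])
    return [sum(means[m][d] for m in members) / len(members) for d in range(dim)]


def cluster_densities(means, threshold):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    means     -- dict mapping a density label to its mean vector
    threshold -- hypothetical stopping criterion; merging stops once the
                 closest pair of clusters is farther apart than this value
    Returns a list of clusters, each a sorted list of density labels.
    """
    clusters = [[label] for label in sorted(means)]
    while len(clusters) > 1:
        # Find the pair of clusters whose centroids are closest.
        (i, j), best = min(
            (
                ((i, j), distance(centroid(clusters[i], means),
                                  centroid(clusters[j], means)))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy data: invented mean vectors for single mixture components.
means = {
    ("de", "a", 0): [1.0, 0.0],
    ("en", "a", 0): [1.1, 0.1],
    ("de", "i", 0): [5.0, 5.0],
    ("es", "i", 0): [5.2, 4.9],
}
clusters = cluster_densities(means, threshold=0.5)
```

In this toy run, densities from different languages that lie close together end up in one shared cluster, which is the sense in which such a procedure could reduce the total number of densities in a multilingual system. A real system would compare full Gaussian densities (means, variances and mixture weights) rather than means alone.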
Cite as: Köhler, J. (1999) Comparing three methods to create multilingual phone models for vocabulary independent speech recognition tasks. Proc. Multi-Lingual Interoperability in Speech Technology, 53-58
@inproceedings{kohler99_mist, author={Joachim Köhler}, title={{Comparing three methods to create multilingual phone models for vocabulary independent speech recognition tasks}}, year=1999, booktitle={Proc. Multi-Lingual Interoperability in Speech Technology}, pages={53--58} }