September 22-25, 1997
In this paper we describe an efficient method for bootstrapping large vocabulary continuous speech recognition (LVCSR) systems using multilingual phoneme sets. To evaluate this technique we collected the multilingual database GlobalPhone, which currently covers 9 different languages. A multilingual recognizer (MULTI) based on four languages, German, English, Japanese, and Spanish, was developed to serve as a source system. This system is also very useful for language identification, where it achieves a 100% identification rate. Based on the MULTI system, we evaluated our bootstrap technique on languages as different as Chinese, Croatian, and Turkish.
Bibliographic reference. Schultz, Tanja / Waibel, Alex (1997): "Fast bootstrapping of LVCSR systems with multilingual phoneme sets", In EUROSPEECH-1997, 371-374.