EUROSPEECH 2003 - INTERSPEECH 2003
We describe the development of a universal phone recognizer for conversational, telephone-quality speech. Its acoustic models were trained in a novel fashion on a wide variety of language data, which allows it to recognize most of the world's major phonemic categories. Moreover, with push-button ease, the recognizer can automatically reconfigure itself to apply the strongest language model in its inventory to whatever language it is used on. In this paper, we describe the system and report its performance on extensive test material, both from languages in its training set and from a language it has never seen. Surprisingly, the recognizer performs nearly equally well on the two types of data, which demonstrates its universality. It therefore presents a viable solution for processing conversational, telephone-quality speech in any language, even low-density languages.
Bibliographic reference. Walker, B.D. / Lackey, B.C. / Muller, J.S. / Schone, P.J. (2003): "Language-reconfigurable universal phone recognition", In EUROSPEECH-2003, 153-156.