7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Likelihood Combination and Recognition Output Voting for the Decoding of Non-Native Speech with Multilingual HMMs

V. Fischer, E. Janke, S. Kunzmann

IBM Voice Systems, Germany

In this paper, we report on the combination of multilingual Hidden Markov Models for the recognition of non-native speech. Using a digit recognition task as an example, we first demonstrate the benefits of bilingual acoustic models that incorporate training data from both the target language and the speakers' native language, and then compare two different recognizer combination methods, namely voting on recognition output (ROVER) and frame-based, time-synchronous likelihood combination. Finally, we demonstrate the usefulness of the proposed methods for speakers whose native language is not in the training data.
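
As a rough illustration of the two combination styles named above, the following Python sketch contrasts them on toy data. It is not the authors' implementation: the function names, the log-linear weighting with a single `weight` parameter, and the assumption that hypotheses are already aligned and of equal length are all illustrative choices not taken from the paper. In particular, the full ROVER procedure first aligns hypotheses by dynamic programming before voting, a step omitted here.

```python
from collections import Counter


def combine_frame_loglikes(loglikes_a, loglikes_b, weight=0.5):
    """Combine two recognizers' scores frame by frame.

    Each argument is a list of frames; each frame is a list of per-state
    log-likelihoods. The weighted sum in the log domain is an assumed
    combination rule, chosen only for illustration.
    """
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(loglikes_a, loglikes_b)
    ]


def vote_on_outputs(hypotheses):
    """Majority vote per word position over pre-aligned hypotheses.

    Simplification: all hypotheses are assumed to have the same length
    and to be aligned position by position.
    """
    voted = []
    for words_at_position in zip(*hypotheses):
        most_common_word, _count = Counter(words_at_position).most_common(1)[0]
        voted.append(most_common_word)
    return voted


# Toy usage: one frame with two states, and three digit-string hypotheses.
combined = combine_frame_loglikes([[-1.0, -2.0]], [[-3.0, -4.0]], weight=0.5)
voted = vote_on_outputs([
    ["1", "2", "3"],
    ["1", "5", "3"],
    ["1", "2", "4"],
])
```

The key difference is where the combination happens: likelihood combination merges scores inside a single decoding pass, while output voting merges the final word strings of separately run recognizers.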

Bibliographic reference.  Fischer, V. / Janke, E. / Kunzmann, S. (2002): "Likelihood combination and recognition output voting for the decoding of non-native speech with multilingual HMMs", In ICSLP-2002, 489-492.