7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Combining Search Spaces of Heterogeneous Recognizers for Improved Speech Recognition

Xiang Li, Rita Singh, Richard M. Stern

Carnegie Mellon University, USA

In speech recognition systems, information from multiple sources, such as different feature streams or acoustic models, can be combined in many ways to improve recognition performance. In theory, the best performance is obtained by a system that uses all sources of information simultaneously, in parallel. Such systems, however, are extremely complex and difficult to construct. In this paper we propose a simple alternative combination criterion that factorizes the complex recognizer into several simple recognizers, each based on a single source of information. We apply this criterion in simple experiments that combine lattices from recognizers built with different feature streams. Experimental results on five different corpora show that the proposed method improves recognition performance.

