ASR2000 - Automatic Speech Recognition: Challenges for the new Millenium

September 18-20, 2000
Paris, France

Combination and Joint Training of Acoustic Classifiers for Speech Recognition

Katrin Kirchhoff and Jeff A. Bilmes

SSLI Laboratory, Department of Electrical Engineering, University of Washington, Seattle, WA, USA

Classifier combination is a technique that often provides significant improvements in accuracy, and also furnishes a useful mechanism to support multi-modal information sources. In this paper we discuss the problem of acoustic classifier combination in speech recognition systems. We present new techniques that generalize previously used combination rules, such as the mean, product, min, and max functions. These new rules have continuous and differentiable forms and can thus not only be used for combination of independently trained classifiers but also as objective functions in new joint classifier training schemes. We demonstrate the application of these rules to both combination and joint training using different input features, and we analyze their effects on word recognition accuracy. We find a significant word-error improvement over the product rule when jointly training and combining multiple systems using a generalization of the product rule.
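
For readers unfamiliar with the four classical rules named in the abstract, the following is a minimal sketch of how they might be applied to per-frame class posteriors from several classifiers. The function names (`combine`, `power_mean`) and the example values are hypothetical and do not come from the paper. The power mean is included only as one familiar continuous, differentiable family that reaches min, mean, and max as special or limiting cases; it is an assumption made for illustration and not necessarily the generalization the authors propose.

```python
import math


def normalize(scores):
    """Rescale non-negative scores so that they sum to one."""
    total = sum(scores)
    return [s / total for s in scores]


def combine(posteriors, rule):
    """Combine class posteriors from several classifiers with a fixed rule.

    posteriors: list of distributions, one per classifier, all over the
    same classes. rule: "mean", "product", "min", or "max".
    """
    n_classes = len(posteriors[0])
    # Gather, for each class, the scores assigned by every classifier.
    per_class = [[p[k] for p in posteriors] for k in range(n_classes)]
    if rule == "mean":
        scores = [sum(c) / len(c) for c in per_class]
    elif rule == "product":
        scores = [math.prod(c) for c in per_class]
    elif rule == "min":
        scores = [min(c) for c in per_class]
    elif rule == "max":
        scores = [max(c) for c in per_class]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return normalize(scores)


def power_mean(values, p):
    """Power mean of strictly positive values (illustrative assumption).

    p = 1 is the arithmetic mean, p = 0 the geometric mean (which ranks
    classes exactly as the product rule does), and large positive or
    negative p approaches max or min respectively.
    """
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


# Hypothetical posteriors from two classifiers over three classes.
posteriors = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
mean_combined = combine(posteriors, "mean")
product_combined = combine(posteriors, "product")
```

Because `power_mean` is smooth in both its inputs and the exponent `p`, a function of this kind could in principle serve inside a training objective, which is the property the abstract emphasizes for its own rules.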


Bibliographic reference.  Kirchhoff, Katrin / Bilmes, Jeff A. (2000): "Combination and joint training of acoustic classifiers for speech recognition", In ASR-2000, 17-23.