Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

A Discriminative Training Procedure Based on Language Model and Dictionary for LVCSR

Daniel Willett, Stefan Müller, Gerhard Rigoll

Department of Computer Science, Faculty of Electrical Engineering, Gerhard-Mercator-University Duisburg, Germany

In today's HMM-based speech recognition systems, the parameters are most commonly estimated according to the Maximum Likelihood criterion. Because of limited training data, however, discriminative objectives provide better parameter estimates with respect to the Maximum A-Posteriori decision used for decoding. The crucial question in discriminative parameter estimation is which distribution functions to discriminate from which, and to what degree. This is particularly difficult because, besides the distribution functions, the recognition procedure is restricted and guided by several other sources of information, such as the language model and the transition matrices. This paper extends the approach presented in [10] to the case of triphones, refines the theory and estimation of the state-to-state confusion metric, and proposes an approximation that allows the approach to be applied to context-dependent systems at reasonable computational cost. The evaluation is performed on continuous HMM speech recognition systems for the WSJ0 5k task. The results demonstrate the practicability of the approach and its extensions.


Bibliographic reference.  Willett, Daniel / Müller, Stefan / Rigoll, Gerhard (1999): "A discriminative training procedure based on language model and dictionary for LVCSR", In EUROSPEECH'99, 2757-2760.