ISCA Archive Eurospeech 1999

A discriminative training procedure based on language model and dictionary for LVCSR

Daniel Willett, Stefan Müller, Gerhard Rigoll

In today's HMM-based speech recognition systems, the parameters are most commonly estimated according to the Maximum Likelihood criterion. Because of limited training data, however, discriminative objectives provide better parameter estimates with respect to the Maximum A-Posteriori decision used for decoding. The question of which distribution functions to discriminate from which, and to what degree, is the most crucial one when performing discriminative parameter estimation. This is particularly difficult because, besides the distribution functions, the recognition procedure is restricted and guided by several other sources of information, such as the language model and the transition matrices. This paper extends the approach presented in [10] to the case of triphones, refines the theory and estimation of the state-to-state confusion metric, and proposes an approximation that allows the approach to be applied to context-dependent systems at reasonable computational cost. The evaluation is performed on continuous HMM speech recognition systems for the WSJ0 5k task. The results demonstrate the practicability of the approach and its extensions.


doi: 10.21437/Eurospeech.1999-607

Cite as: Willett, D., Müller, S., Rigoll, G. (1999) A discriminative training procedure based on language model and dictionary for LVCSR. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2757-2760, doi: 10.21437/Eurospeech.1999-607

@inproceedings{willett99_eurospeech,
  author={Daniel Willett and Stefan Müller and Gerhard Rigoll},
  title={{A discriminative training procedure based on language model and dictionary for LVCSR}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2757--2760},
  doi={10.21437/Eurospeech.1999-607}
}