ISCA Archive Interspeech 2007

Optimization on decoding graphs by discriminative training

Shiuan-Sung Lin, François Yvon

The three main knowledge sources used in automatic speech recognition (ASR), namely the acoustic models, a dictionary and a language model, are usually designed and optimized in isolation. Our previous work [1] proposed a methodology for jointly tuning these parameters, based on integrating the resources into a finite-state graph whose transition weights are trained discriminatively. This paper extends the training framework to a large-vocabulary task, the automatic transcription of French broadcast news. We propose several fast decoding techniques to make training practical. Experiments show that an absolute word error rate (WER) reduction of 1% can be obtained. We conclude the paper with an appraisal of the potential of this approach for large-vocabulary ASR tasks.
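The core idea of the abstract, discriminatively adjusting the transition weights of an integrated decoding graph, can be illustrated with a minimal perceptron-style sketch. This is not the authors' implementation: the toy graph, arc representation, and learning rate below are illustrative assumptions only.

```python
# Hedged sketch: perceptron-style discriminative update of transition
# weights on a tiny decoding graph. Arcs are (source, target, label)
# tuples; the graph, paths, and learning rate are made up for illustration.

def update_weights(weights, ref_path, hyp_path, eta=0.1):
    """Boost arcs on the reference path and penalize arcs on the
    erroneous hypothesis path; arcs shared by both paths cancel out."""
    new_w = dict(weights)
    for arc in ref_path:
        new_w[arc] = new_w.get(arc, 0.0) + eta
    for arc in hyp_path:
        new_w[arc] = new_w.get(arc, 0.0) - eta
    return new_w

# Toy graph: two competing arcs out of state s1 ("cat" vs. "cap").
weights = {("s0", "s1", "the"): 0.5,
           ("s1", "s2", "cat"): 0.2,
           ("s1", "s2", "cap"): 0.4}
ref = [("s0", "s1", "the"), ("s1", "s2", "cat")]   # correct transcription
hyp = [("s0", "s1", "the"), ("s1", "s2", "cap")]   # decoder's error
weights = update_weights(weights, ref, hyp)
# The shared arc "the" is unchanged; "cat" gains weight, "cap" loses it.
```

Repeated over a training corpus, such updates shift probability mass toward arcs used by correct transcriptions, which is the intuition behind jointly tuning the integrated graph rather than each knowledge source in isolation.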

doi: 10.21437/Interspeech.2007-487

Cite as: Lin, S.-S., Yvon, F. (2007) Optimization on decoding graphs by discriminative training. Proc. Interspeech 2007, 1737-1740, doi: 10.21437/Interspeech.2007-487

@inproceedings{lin07_interspeech,
  author={Shiuan-Sung Lin and François Yvon},
  title={{Optimization on decoding graphs by discriminative training}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1737--1740},
  doi={10.21437/Interspeech.2007-487}
}