ISCA Archive Interspeech 2007

Optimization on decoding graphs by discriminative training

Shiuan-Sung Lin, François Yvon

The three main knowledge sources used in automatic speech recognition (ASR), namely the acoustic models, a dictionary and a language model, are usually designed and optimized in isolation. Our previous work [1] proposed a methodology for jointly tuning these parameters, based on the integration of the resources as a finite-state graph, whose transition weights are trained discriminatively. This paper extends the training framework to a large vocabulary task, the automatic transcription of French broadcast news. We propose several fast decoding techniques to make the training practical. Experiments show that an absolute reduction of 1% in word error rate (WER) can be obtained. We conclude the paper with an appraisal of the potential of this approach on large vocabulary ASR tasks.
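For readers unfamiliar with the general idea of training transition weights of a decoding graph discriminatively, the following Python sketch illustrates one common realization, a structured-perceptron style update of arc weights. It is only an illustration under assumed names (the `decode_best_path` callback and arc-occupancy counts are hypothetical), not the specific algorithm described in [1] or in this paper.

```python
from collections import defaultdict

def discriminative_update(graph_weights, utterance, reference_arcs,
                          decode_best_path, learning_rate=0.1):
    """One perceptron-style update over a single utterance.

    graph_weights    : dict mapping arc id -> transition weight (log domain)
    utterance        : acoustic observations for this utterance
    reference_arcs   : arc ids on the correct (reference) path
    decode_best_path : function(graph_weights, utterance) -> list of arc ids
                       on the current 1-best hypothesis path (assumed helper)
    """
    hyp_arcs = decode_best_path(graph_weights, utterance)

    # Count how often each arc is used on the reference and hypothesis paths.
    ref_counts = defaultdict(int)
    hyp_counts = defaultdict(int)
    for a in reference_arcs:
        ref_counts[a] += 1
    for a in hyp_arcs:
        hyp_counts[a] += 1

    # Reward arcs on the reference path, penalize arcs on the erroneous
    # hypothesis path; arcs shared by both paths are left unchanged.
    for a in set(ref_counts) | set(hyp_counts):
        graph_weights[a] += learning_rate * (ref_counts[a] - hyp_counts[a])

    return graph_weights
```

In such a scheme, the cost of the repeated decoding passes dominates training time, which is why the paper's proposed fast decoding techniques matter for making training on a large vocabulary task practical.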


doi: 10.21437/Interspeech.2007-487

Cite as: Lin, S.-S., Yvon, F. (2007) Optimization on decoding graphs by discriminative training. Proc. Interspeech 2007, 1737-1740, doi: 10.21437/Interspeech.2007-487

@inproceedings{lin07c_interspeech,
  author={Shiuan-Sung Lin and François Yvon},
  title={{Optimization on decoding graphs by discriminative training}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1737--1740},
  doi={10.21437/Interspeech.2007-487}
}