15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

A Comparison of Training Approaches for Discriminative Segmental Models

Hao Tang, Kevin Gimpel, Karen Livescu

Toyota Technological Institute at Chicago, USA

Segmental models such as segmental conditional random fields have had some recent success in lattice rescoring for speech recognition. They provide a flexible framework for incorporating a wide range of features across different levels of units, such as phones and words. However, such models have mainly been trained by maximizing conditional likelihood, which may not be the best proxy for the task loss of speech recognition. In addition, there has been little work on designing cost functions as surrogates for the word error rate. In this paper, we investigate various losses and introduce a new cost function for training segmental models. We compare lattice rescoring results for multiple tasks and also study the impact of several choices required when optimizing these losses.


Bibliographic reference. Tang, Hao / Gimpel, Kevin / Livescu, Karen (2014): "A comparison of training approaches for discriminative segmental models", in INTERSPEECH-2014, 1219-1223.