8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Pinched Lattice Minimum Bayes Risk Discriminative Training for Large Vocabulary Continuous Speech Recognition

Vlasios Doumpiotis, William Byrne

Johns Hopkins University, USA

Iterative estimation procedures that minimize empirical risk under general loss functions, such as the Levenshtein distance, have been derived as extensions of the Extended Baum-Welch algorithm. While reducing expected loss on training data is a desirable training criterion, these algorithms can be difficult to apply. Unlike MMI estimation, they require an explicit listing of the hypotheses to be considered, and in complex problems such lists tend to be prohibitively large. To overcome this difficulty, modeling techniques originally developed to improve search efficiency in Minimum Bayes Risk decoding can be used to transform these estimation algorithms so that exact-update risk minimization procedures can be applied to complex recognition problems. Experimental results on two large vocabulary speech recognition tasks show improvements over conventionally trained MMIE models.
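
To make the training criterion concrete, the following is a minimal sketch of the empirical risk for a single utterance: the expected Levenshtein loss over an explicit hypothesis list. It is an illustration only and not the pinched lattice procedure of the paper. The function names, the N-best list representation, and the posterior values are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple


def levenshtein(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def expected_loss(reference: Sequence[str],
                  nbest: List[Tuple[Sequence[str], float]]) -> float:
    """Posterior-weighted Levenshtein loss over an explicit hypothesis list.

    `nbest` holds (hypothesis, score) pairs; scores are renormalized here so
    that they sum to one over the list (an assumption of this sketch).
    """
    total = sum(score for _, score in nbest)
    return sum(score / total * levenshtein(reference, hyp)
               for hyp, score in nbest)


if __name__ == "__main__":
    ref = "the cat sat".split()
    # Hypothetical hypotheses and posteriors, for illustration only.
    hyps = [("the cat sat".split(), 0.5),
            ("the cat sad".split(), 0.3),
            ("a cat sat down".split(), 0.2)]
    print(expected_loss(ref, hyps))
```

The sum runs over every listed hypothesis, which is why the cost of this direct formulation grows with the size of the list.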


Bibliographic reference. Doumpiotis, Vlasios / Byrne, William (2004): "Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition", in INTERSPEECH-2004, 1717-1720.