We perform large margin training of HMM acoustic parameters by maximizing a penalty function which combines two terms. The first term is a scale which gets multiplied with the Hamming distance between HMM state sequences to form a multi-label (or sequence) margin. The second term arises from constraints on the training data that the joint log-likelihoods of acoustic and correct word sequences exceed the joint log-likelihoods of acoustic and incorrect word sequences by at least the multi-label margin between the corresponding Viterbi state sequences. Using the soft-max trick, we collapse these constraints into a boosted MMI-like term. The resulting objective function can be efficiently maximized using extended Baum-Welch updates. Experimental results on multiple LVCSR tasks show a good correlation between the objective function and the word error rate.
Bibliographic reference. Saon, George / Povey, Daniel (2008): "Penalty function maximization for large margin HMM training", In INTERSPEECH-2008, 920-923.