This paper presents a novel discriminative training technique for noisy speech recognition. First, we define a Frame Margin Probability (FMP) which denotes the difference of score of a frame on its right model and on its competing model. The frames with negative FMP values are regarded as confusable frames and the frames with positive FMP values are regarded as discriminable frames. Second, the confusable frames will be emphasized and the overly discriminable frames will be deweighted by an empirical weighting function. Then the acoustic model parameters are tuned using the weighted frames. By this kind of weighting, the confusable frames, which are often noisy, can contribute more to the acoustic model than those without weighting. We evaluate this technology using the Aurora standard database (TIdigits) and HTK3.3, and obtain a 15.9% WER reduction for noisy speech recognition and a 13.13% WER reduction for clean speech recognition compared with the MLE baseline systems.
Bibliographic reference. Li, Hao-Zheng / O'Shaughnessy, Douglas (2007): "Frame margin probability discriminative training algorithm for noisy speech recognition", In INTERSPEECH-2007, 38-41.