12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Individual Error Minimization Learning Framework and its Applications to Speech Recognition and Utterance Verification

Sunghwan Shin (1), Ho-Young Jung (2), Biing-Hwang Juang (1)

(1) Georgia Institute of Technology, USA
(2) ETRI, Korea

In this paper, we extend the individual recognition error minimization criteria MDE/MIE/MSE [1] to the word level and apply them to word recognition and verification tasks. To reduce potential word-level errors effectively, we expand the training token selection scheme so that it better suits a word-level learning framework, taking neighboring words into account and covering the internal phonemes of each training word. We then evaluate the proposed word-level learning criteria on the TIMIT word recognition task and further investigate how well each type of recognition error is rejected in utterance verification (UV). Experimental results confirm that each word-level objective criterion primarily reduces its corresponding target error type. The rejection rates of insertion and substitution errors also improve under the MIE and MSE criteria, which leads to an additional word error rate reduction after rejection.
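
For readers unfamiliar with the three error types the criteria target (deletion, insertion and substitution, as named in the title of [1]), the following minimal sketch shows how they are commonly counted by aligning a recognized word sequence against its reference with edit distance. It only illustrates the error categories and the resulting word error rate; it is not the training procedure of the paper. The function name `count_errors` and the example sentences are hypothetical.

```python
def count_errors(reference, hypothesis):
    """Align two word lists by edit distance and count each error type.

    Illustrative only: a standard Levenshtein alignment, not the
    discriminative training method described in the abstract.
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total errors, substitutions, deletions, insertions)
    # for aligning the first i reference words with the first j hypothesis words.
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word so far was deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word so far was inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                match_or_sub = (t, s, d, ins)          # correct word
            else:
                match_or_sub = (t + 1, s + 1, d, ins)  # substitution
            t, s, d, ins = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, ins)          # reference word missing
            t, s, d, ins = cost[i][j - 1]
            insertion = (t + 1, s, d, ins + 1)         # extra hypothesis word
            # Tuples compare on total errors first, so this keeps a minimum-cost path.
            cost[i][j] = min(match_or_sub, deletion, insertion)
    total, subs, dels, inss = cost[n][m]
    return {
        "substitutions": subs,
        "deletions": dels,
        "insertions": inss,
        "wer": total / n if n else 0.0,
    }


# Hypothetical example: "had" is deleted and "in" is inserted.
ref = "she had your dark suit".split()
hyp = "she your dark suit in".split()
result = count_errors(ref, hyp)
print(result)
```

When several alignments share the same total cost, the split into error types can differ; this sketch simply keeps the one that sorts first, which is an arbitrary choice.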


  1. Shin, S., Jung, H.-Y. and Juang, B.-H., “Discriminative Training for Direct Minimization of Deletion, Insertion and Substitution Errors,” in ICASSP 2011, pp. 5328-5331, May 2011


Bibliographic reference.  Shin, Sunghwan / Jung, Ho-Young / Juang, Biing-Hwang (2011): "Individual error minimization learning framework and its applications to speech recognition and utterance verification", In INTERSPEECH-2011, 1713-1716.