EUROSPEECH 2003 - INTERSPEECH 2003
In many telephony applications that use speech recognition, it is important to identify and reject out-of-vocabulary words, or utterances that contain no keywords, by means of utterance verification (UV). Typically, UV is based on the likelihood ratio of a target model versus an alternative model. The quality of these models, and the criteria used to estimate them, can have a significant impact on UV performance. Because UV can be treated as a two-class classification problem, minimum classification error (MCE) training is a natural choice. Earlier work has focused on MCE training to reduce the total classification error. In this paper, we extend the MCE approach to minimize the error rates at specific operating points, and show that this can yield a significant reduction in equal error rate (EER) for phone verification on the TIMIT corpus and on a corpus of non-native children's speech. Although the technique is developed for utterance verification, it can also be generalized to other verification tasks such as speaker verification.
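As a rough illustration of the likelihood-ratio test and the EER measure mentioned in the abstract, the following minimal sketch may help. It is not the authors' method: the one-dimensional Gaussian target and alternative models, the function names, and the threshold sweep are all assumptions made for this example, and no MCE training is shown.

```python
import math


def log_likelihood_ratio(x, target, alternative):
    """Log-likelihood ratio of a scalar observation x.

    `target` and `alternative` are (mean, std) pairs of 1-D Gaussians.
    This model form is an assumption made purely for illustration.
    """
    def log_gauss(value, mean, std):
        return (-0.5 * math.log(2 * math.pi * std * std)
                - (value - mean) ** 2 / (2 * std * std))

    return log_gauss(x, *target) - log_gauss(x, *alternative)


def verify(x, target, alternative, threshold=0.0):
    """Accept the hypothesis when the log-likelihood ratio reaches the threshold."""
    return log_likelihood_ratio(x, target, alternative) >= threshold


def error_rates(target_scores, impostor_scores, threshold):
    """Return (false rejection rate, false acceptance rate) at one threshold."""
    false_reject = sum(s < threshold for s in target_scores) / len(target_scores)
    false_accept = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return false_reject, false_accept


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    best = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        fr, fa = error_rates(target_scores, impostor_scores, t)
        gap = abs(fr - fa)
        if best is None or gap < best[0]:
            best = (gap, (fr + fa) / 2, t)
    return best[1], best[2]
```

Each threshold corresponds to one operating point, that is, one trade-off between false rejections and false acceptances. The EER is the operating point at which the two rates coincide.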
Bibliographic reference. Au, Wing-Hei / Siu, Man-Hung (2003): "A new approach to minimize utterance verification error rate for a specific operating point", In EUROSPEECH-2003, 909-912.