7th International Conference on Spoken Language Processing
September 16-20, 2002
This article shows that the Minimum Classification Error (MCE) criterion function commonly used for discriminative design of speech recognition systems is equivalent to a Parzen-window-based estimate of the theoretical Bayes classification risk. In this analysis, each training token is mapped to the center of a Parzen kernel in the domain of a suitably defined random variable. The kernels are summed to produce a density estimate; this estimate can in turn be integrated over the domain of incorrect classifications, yielding the risk estimate. The risk contributed by each kernel corresponds directly to the usual MCE loss function. The resulting risk estimate can be minimized by adapting the recognition system parameters that determine the mapping from training token to kernel center. This analysis provides a novel link between the MCE empirical cost measured on a finite training set and the theoretical Bayes classification risk.
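The correspondence described in the abstract can be checked numerically on a toy example. The sketch below is illustrative only and is not taken from the paper. It assumes a logistic-shaped Parzen kernel, a scalar misclassification measure d for which d > 0 means an incorrect classification, and a kernel width h; the values in `d_values` are made up. Under these assumptions, integrating the Parzen density estimate over d > 0 gives the same number as averaging a sigmoid loss with slope 1/h over the tokens.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_kernel(u, center, h):
    """Logistic density centered at `center` with width h (assumed kernel shape)."""
    s = sigmoid((u - center) / h)
    return s * (1.0 - s) / h

def parzen_density(u, centers, h):
    """Parzen estimate: average of one kernel per training token."""
    return sum(logistic_kernel(u, c, h) for c in centers) / len(centers)

def parzen_risk(centers, h, upper=60.0, steps=20000):
    """Trapezoidal integral of the density over the error region [0, upper]."""
    du = upper / steps
    total = 0.5 * (parzen_density(0.0, centers, h)
                   + parzen_density(upper, centers, h))
    for k in range(1, steps):
        total += parzen_density(k * du, centers, h)
    return total * du

def mce_loss(centers, h):
    """Average sigmoid loss with slope 1/h, evaluated at each kernel center."""
    return sum(sigmoid(c / h) for c in centers) / len(centers)

# Hypothetical misclassification measures for four training tokens.
d_values = [-2.0, -0.5, 0.3, 1.5]
h = 0.5

risk = parzen_risk(d_values, h)
loss = mce_loss(d_values, h)
print(abs(risk - loss) < 1e-4)
```

With a logistic kernel the agreement is exact up to integration error, because the mass of a logistic density centered at d_i that lies above zero is sigmoid(d_i / h). Other kernel shapes would yield other loss functions with the same interpretation.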
Bibliographic reference. McDermott, Erik / Katagiri, Shigeru (2002): "Classification error from the theoretical Bayes classification risk", In ICSLP-2002, 2465-2468.