In this paper, two common discriminative training criteria, maximum mutual information (MMI) and minimum phone error (MPE), are investigated. Two main issues are addressed: sensitivity to different lattice segmentations and the contribution of the parameter estimation method. It is noted that MMI and MPE may benefit from different lattice segmentation strategies. Using discriminative criterion values as a measure of model goodness is shown to be problematic, as the recognition results do not correlate well with these measures. Moreover, the parameter estimation method clearly affects the recognition performance of the model irrespective of the value of the discriminative criterion. The dependence on the recognition task is also demonstrated by example with the two Finnish large-vocabulary dictation tasks used in the experiments.
Bibliographic reference. Pylkkönen, Janne (2009): "Investigations on discriminative training in large scale acoustic model estimation", In INTERSPEECH-2009, 220-223.