Neural networks (NNs) are discriminative classifiers that have been successfully integrated with hidden Markov models (HMMs), either in hybrid NN/HMM or tandem connectionist systems. Typically, the NNs are trained with the frame-based cross-entropy criterion to classify phonemes or phoneme states. For word recognition, however, the word error rate is more closely related to sequence classification criteria such as maximum mutual information (MMI) and minimum phone error (MPE). In this paper, lattice-based sequence classification criteria are used to train the NNs in both the hybrid NN/HMM system and the tandem system. A product-of-experts-based factorization and smoothing scheme is proposed for the hybrid system to scale lattice-based NN training up to 6000 triphone states. Experimental results on the WSJCAM0 corpus reveal that the NNs trained with the sequential classification criterion yield a 24.2% relative improvement over cross-entropy-trained NNs for the hybrid system.
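As background, the lattice-based MMI criterion mentioned above has a standard form (this is the conventional formulation, not a quotation from the paper; the acoustic scale \kappa and symbols O_u, W_u are the usual conventions):

```latex
% MMI objective over training utterances u, with acoustic model parameters \lambda:
% numerator = reference transcription W_u; denominator = all competing hypotheses,
% in practice approximated by the word lattice.
\mathcal{F}_{\mathrm{MMI}}(\lambda)
  = \sum_{u} \log
    \frac{p_{\lambda}(O_u \mid W_u)^{\kappa}\, P(W_u)}
         {\sum_{W} p_{\lambda}(O_u \mid W)^{\kappa}\, P(W)}
```

Here O_u is the observation sequence of utterance u, W_u its reference word sequence, P(W) the language-model probability, and \kappa the acoustic scaling factor; the denominator sum over all W is restricted to the paths in a recognition lattice, which is what makes the training "lattice-based."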
Bibliographic reference. Wang, Guangsen / Sim, Khe Chai (2011): "Sequential classification criteria for NNs in automatic speech recognition", In INTERSPEECH-2011, 441-444.