In tandem systems, the outputs of multi-layer perceptron (MLP) classifiers have been successfully used as features for HMM-based automatic speech recognition. In this paper, we propose a data-driven clustered hierarchical tandem system that yields improved performance on a large-vocabulary broadcast news transcription task. The complicated global learning problem of a single large monolithic MLP classifier is divided into simpler tasks: hierarchical structures, clustered according to the outputs of the monolithic MLP, are used to alleviate phone confusion. The proposed approach yields error rate reductions of up to 16.4% over MFCC features alone.
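The data-driven clustering step can be illustrated with a minimal sketch: estimate a phone confusion matrix from the monolithic MLP's posteriors, then greedily merge the most-confused phones so that each resulting cluster can be handled by a simpler classifier in the hierarchy. All function names, the greedy linkage rule, and the toy data below are illustrative assumptions, not the paper's actual procedure.

```python
# Hypothetical sketch: cluster phones by the confusion observed in
# monolithic-MLP posteriors; each cluster would then get its own
# expert classifier in the hierarchy. Names and data are illustrative.

def confusion_matrix(posteriors, labels, num_phones):
    """Average posterior mass each true phone places on every phone."""
    counts = [0] * num_phones
    conf = [[0.0] * num_phones for _ in range(num_phones)]
    for post, lab in zip(posteriors, labels):
        counts[lab] += 1
        for j, p in enumerate(post):
            conf[lab][j] += p
    for i in range(num_phones):
        if counts[i]:
            conf[i] = [v / counts[i] for v in conf[i]]
    return conf

def cluster_phones(conf, num_clusters):
    """Greedily merge the most mutually confused phone clusters."""
    clusters = [{i} for i in range(len(conf))]

    def link(a, b):
        # Symmetric confusion between two clusters (single linkage).
        return max(conf[i][j] + conf[j][i] for i in a for j in b)

    while len(clusters) > num_clusters:
        x, y = max(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[x] |= clusters.pop(y)
    return clusters

# Toy example: phones 0/1 and 2/3 are mutually confused.
posteriors = [
    [0.60, 0.35, 0.03, 0.02],  # frame with true phone 0
    [0.40, 0.55, 0.03, 0.02],  # true phone 1
    [0.02, 0.03, 0.60, 0.35],  # true phone 2
    [0.02, 0.03, 0.30, 0.65],  # true phone 3
]
labels = [0, 1, 2, 3]
conf = confusion_matrix(posteriors, labels, 4)
groups = cluster_phones(conf, num_clusters=2)
```

Under this toy data the confused pairs (0, 1) and (2, 3) end up in the same clusters, which is the behavior a confusion-driven hierarchy would exploit.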
Bibliographic reference. Chang, Shuo-Yiin / Lee, Lin-shan (2008): "Data-driven clustered hierarchical tandem system for LVCSR", In INTERSPEECH-2008, 2250-2253.