This paper introduces the sparse auto-associative neural network (SAANN), in which the output of the internal hidden layer is forced to be sparse. This is achieved by adding a sparsity regularization term to the original reconstruction error cost function and updating the network parameters to minimize the overall cost. We demonstrate the applicability of this network to phoneme recognition by extracting sparse hidden layer outputs (used as features) from a network trained in an unsupervised manner on perceptual linear prediction (PLP) cepstral coefficients. Experiments with the SAANN features on a state-of-the-art TIMIT phoneme recognition system show a relative improvement in phoneme error rate of 5.1% over the baseline PLP features.
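To make the idea concrete, the following is a minimal sketch of such a network: a single-hidden-layer auto-associative (autoencoder) network whose cost is the reconstruction error plus an L1 penalty on the hidden activations, trained by plain gradient descent. The layer sizes, penalty weight `lam`, learning rate, and the use of sigmoid hidden units with a linear output layer are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Sketch of a sparse auto-associative network: the hidden-layer
# outputs (the features) are pushed toward sparsity by an L1 term
# added to the reconstruction cost. All hyperparameters below are
# illustrative assumptions, not values from the paper.

rng = np.random.default_rng(0)

n_in, n_hid = 13, 40           # e.g. 13 cepstral coefficients per frame
W1 = rng.normal(0, 0.1, (n_in, n_hid))
b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_in))
b2 = np.zeros(n_in)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X):
    H = sigmoid(X @ W1 + b1)   # hidden layer: the sparse features
    Y = H @ W2 + b2            # linear reconstruction of the input
    return H, Y

def cost(X, lam=0.1):
    H, Y = forward(X)
    recon = 0.5 * np.mean(np.sum((Y - X) ** 2, axis=1))   # reconstruction error
    sparse = lam * np.mean(np.sum(np.abs(H), axis=1))     # sparsity penalty
    return recon + sparse

def train_step(X, lam=0.1, lr=0.05):
    """One gradient-descent step on the combined cost."""
    global W1, b1, W2, b2
    H, Y = forward(X)
    n = X.shape[0]
    dY = (Y - X) / n                        # grad of reconstruction term w.r.t. Y
    dW2 = H.T @ dY
    db2 = dY.sum(axis=0)
    dH = dY @ W2.T + lam * np.sign(H) / n   # backprop + sparsity gradient
    dZ = dH * H * (1.0 - H)                 # sigmoid derivative
    dW1 = X.T @ dZ
    db1 = dZ.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Unsupervised training on synthetic stand-in data (real input would
# be PLP cepstral frames); the overall cost should decrease.
X = rng.normal(0, 1, (200, n_in))
c0 = cost(X)
for _ in range(200):
    train_step(X)
c1 = cost(X)
```

After training, `forward(X)[0]` gives the sparse hidden-layer outputs that the paper uses as features for the downstream phoneme recognizer.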
Bibliographic reference. Sivaram, Garimella S. V. S. / Ganapathy, Sriram / Hermansky, Hynek (2010): "Sparse auto-associative neural networks: theory and application to speech recognition", in Proc. INTERSPEECH 2010, pp. 2270-2273.