11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Sparse Auto-Associative Neural Networks: Theory and Application to Speech Recognition

Garimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky

Johns Hopkins University, USA

This paper introduces the sparse auto-associative neural network (SAANN), in which the hidden-layer output is forced to be sparse. This is achieved by adding a sparse regularization term to the original reconstruction error cost function and updating the network parameters to minimize the overall cost. We demonstrate the applicability of this network to phoneme recognition by extracting sparse hidden-layer outputs (used as features) from a network trained on perceptual linear prediction (PLP) cepstral coefficients in an unsupervised manner. Experiments with the SAANN features on a state-of-the-art TIMIT phoneme recognition system show a relative improvement in phoneme error rate of 5.1% over the baseline PLP features.
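The core idea described in the abstract (an auto-associative network whose cost is reconstruction error plus a sparsity penalty on the hidden layer) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the network sizes, learning rate, L1 penalty weight `lam`, and the use of random data in place of real PLP features are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a sparse auto-associative network: reconstruction
# error plus an L1 sparsity penalty on the hidden activations.
# All hyperparameters below are illustrative, not from the paper.
rng = np.random.default_rng(0)
n_in, n_hid = 13, 32            # e.g. 13 PLP cepstral coefficients
lam, lr = 1e-3, 0.1             # sparsity weight and learning rate

W1 = rng.normal(0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_in)); b2 = np.zeros(n_in)

def forward(X):
    H = np.tanh(X @ W1 + b1)    # sparse hidden layer (the SAANN features)
    Y = H @ W2 + b2             # linear reconstruction of the input
    return H, Y

def cost(X):
    H, Y = forward(X)
    # overall cost = reconstruction error + L1 sparsity penalty
    return 0.5 * np.mean((Y - X) ** 2) + lam * np.mean(np.abs(H))

X = rng.normal(0, 1, (256, n_in))   # stand-in for PLP feature vectors
c0 = cost(X)

for _ in range(200):                # plain batch gradient descent
    H, Y = forward(X)
    n = X.shape[0]
    dY = (Y - X) / (n * n_in)                       # grad of mean sq. error
    dH = dY @ W2.T + lam * np.sign(H) / (n * n_hid) # add L1 subgradient
    dZ = dH * (1 - H ** 2)                          # tanh derivative
    W2 -= lr * (H.T @ dY); b2 -= lr * dY.sum(0)
    W1 -= lr * (X.T @ dZ); b1 -= lr * dZ.sum(0)

c1 = cost(X)
```

In this sketch, the hidden activations `H` would serve as the extracted features after unsupervised training; the L1 term drives many of them toward zero.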

Full Paper

Bibliographic reference. Sivaram, Garimella S. V. S. / Ganapathy, Sriram / Hermansky, Hynek (2010): "Sparse auto-associative neural networks: theory and application to speech recognition", in INTERSPEECH-2010, 2270-2273.