9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

A Neural Network Based Nonlinear Feature Transformation for Speech Recognition

Hongbing Hu, Stephen A. Zahorian

Binghamton University, USA

This paper describes a neural network based method for reducing the dimensionality of speech features, with the goal of accurate phonetic recognition. In our previous work, we proposed neural network based nonlinear principal component analysis (NLPCA) as a dimensionality reduction approach for speech features and showed that the reduced-dimensionality features represent the data very effectively for vowel classification. In this paper, we extend the NLPCA approach to phonetic recognition of continuous speech: the reduced-dimensionality features obtained with NLPCA serve as the input features for HMM phone models. Experimental evaluation on the TIMIT database shows that recognition accuracies with the NLPCA reduced-dimensionality features are higher than those obtained with the original features, especially when the HMM phone models use a small number of states and mixtures.
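
The abstract does not specify the network architecture or training procedure, so the following is only a minimal sketch of one common formulation of NLPCA: an autoassociative network with a narrow bottleneck layer, trained to reconstruct its input, whose bottleneck activations are then used as the reduced features. Everything in it is an assumption made for illustration: the function names (`train_nlpca`, `reduce_features`), the layer sizes, the plain gradient-descent training, and the synthetic data that stands in for real speech features.

```python
import numpy as np

def init_layer(rng, n_in, n_out):
    # Small random weights scaled by fan-in, zero biases.
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out)

def train_nlpca(X, n_hidden=16, n_bottleneck=2, epochs=1000, lr=0.005, seed=0):
    """Train input -> tanh -> linear bottleneck -> tanh -> linear output
    to reconstruct X, using full-batch gradient descent (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sizes = [d, n_hidden, n_bottleneck, n_hidden, d]
    params = [init_layer(rng, a, b) for a, b in zip(sizes[:-1], sizes[1:])]
    losses = []
    for _ in range(epochs):
        (W1, b1), (W2, b2), (W3, b3), (W4, b4) = params
        # Forward pass.
        h1 = np.tanh(X @ W1 + b1)
        z = h1 @ W2 + b2              # bottleneck: the reduced features
        h3 = np.tanh(z @ W3 + b3)
        y = h3 @ W4 + b4
        err = y - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the loss 0.5 * mean_over_frames(sum(err**2)).
        g_y = err / n
        g_h3 = (g_y @ W4.T) * (1.0 - h3 ** 2)
        g_z = g_h3 @ W3.T
        g_h1 = (g_z @ W2.T) * (1.0 - h1 ** 2)
        grads = [
            (X.T @ g_h1, g_h1.sum(axis=0)),
            (h1.T @ g_z, g_z.sum(axis=0)),
            (z.T @ g_h3, g_h3.sum(axis=0)),
            (h3.T @ g_y, g_y.sum(axis=0)),
        ]
        params = [(W - lr * gW, b - lr * gb)
                  for (W, b), (gW, gb) in zip(params, grads)]
    return params, losses

def reduce_features(X, params):
    # Run only the encoder half and return the bottleneck activations.
    (W1, b1), (W2, b2) = params[0], params[1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in for speech features: 200 frames, 6 dimensions,
# all driven nonlinearly by one hidden variable t (not real speech data).
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, 200)
X = np.stack([t, t ** 2, np.sin(3 * t), np.cos(2 * t), t ** 3, np.tanh(2 * t)], axis=1)
X = X + 0.05 * rng.normal(size=X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each dimension

params, losses = train_nlpca(X)
Z = reduce_features(X, params)             # shape: (200, 2)
```

In a recognizer along the lines the abstract describes, the rows of `Z` would take the place of the original feature vectors as observations for the HMM phone models; that stage is not shown here.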

Bibliographic reference.  Hu, Hongbing / Zahorian, Stephen A. (2008): "A neural network based nonlinear feature transformation for speech recognition", In INTERSPEECH-2008, 1533-1536.