8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Dimensionality Reduction of Speech Features Using Nonlinear Principal Components Analysis

Stephen A. Zahorian (1), Tara Singh (2), Hongbing Hu (1)

(1) Binghamton University, USA
(2) Old Dominion University, USA

One of the main practical difficulties for automatic speech recognition is the high dimensionality of acoustic feature spaces and the resulting training problems, collectively referred to as the "curse of dimensionality." Many linear techniques, most notably principal components analysis (PCA), linear discriminant analysis (LDA), and several of their variants, have been used to reduce dimensionality while attempting to preserve the variability and class discriminability of the feature space. However, these orthogonal rotations of the feature space are suboptimal if the data lie primarily on curved subspaces embedded in the higher dimensional feature space. In this paper, two neural network based nonlinear transformations are used to represent speech data in reduced dimensionality subspaces. It is shown that a subspace computed with the explicit intent of maximizing classification accuracy is far superior to a subspace derived so as to minimize mean square representation error.
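The reconstruction-oriented variant described above is commonly realized as a bottleneck autoencoder ("nonlinear PCA"): a network trained to reproduce its input through a narrow middle layer, whose activations become the reduced features. The following is a minimal NumPy sketch of that idea; the layer sizes, activations, learning rate, and synthetic data are illustrative assumptions, not the paper's actual architecture or speech features. The classification-oriented alternative the paper favors would keep the encoder but train against class targets instead of the reconstruction error minimized here.

```python
import numpy as np

# Hedged sketch of a bottleneck autoencoder (nonlinear PCA) in plain NumPy.
# All sizes and hyperparameters are illustrative, not from the paper.
rng = np.random.default_rng(0)

# Synthetic data near a curved 1-D manifold embedded in 3-D -- the kind of
# structure a linear PCA rotation cannot capture with one component.
t = rng.uniform(-1.0, 1.0, size=(200, 1))
X = np.hstack([t, t**2, t**3]) + 0.01 * rng.standard_normal((200, 3))

d_in, d_hid, d_bot = 3, 8, 1            # input, hidden, bottleneck widths
W1 = 0.5 * rng.standard_normal((d_in, d_hid))
W2 = 0.5 * rng.standard_normal((d_hid, d_bot))
W3 = 0.5 * rng.standard_normal((d_bot, d_hid))
W4 = 0.5 * rng.standard_normal((d_hid, d_in))
lr = 0.05

def forward(X):
    H1 = np.tanh(X @ W1)   # encoder hidden layer
    Z = H1 @ W2            # linear bottleneck: the reduced features
    H2 = np.tanh(Z @ W3)   # decoder hidden layer
    Xh = H2 @ W4           # reconstruction of the input
    return H1, Z, H2, Xh

def mse(X):
    return float(np.mean((forward(X)[3] - X) ** 2))

err_before = mse(X)
for _ in range(500):
    H1, Z, H2, Xh = forward(X)
    # Backpropagation of the mean squared reconstruction error.
    G = 2.0 * (Xh - X) / X.size
    gW4 = H2.T @ G
    GH2 = (G @ W4.T) * (1.0 - H2**2)     # tanh derivative
    gW3 = Z.T @ GH2
    GZ = GH2 @ W3.T
    gW2 = H1.T @ GZ
    GH1 = (GZ @ W2.T) * (1.0 - H1**2)
    gW1 = X.T @ GH1
    W1 -= lr * gW1; W2 -= lr * gW2; W3 -= lr * gW3; W4 -= lr * gW4

err_after = mse(X)
# Gradient descent should reduce the reconstruction error from its
# random-initialization value; the bottleneck activations Z are the
# 1-D nonlinear representation of the 3-D input.
```

Swapping the decoder half for a softmax over phone classes and minimizing classification error instead of `mse` yields the discriminative bottleneck the abstract reports as far superior.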


Bibliographic reference.  Zahorian, Stephen A. / Singh, Tara / Hu, Hongbing (2007): "Dimensionality reduction of speech features using nonlinear principal components analysis", In INTERSPEECH-2007, 1134-1137.