Previous work has considered methods for learning projections of high-dimensional acoustic representations to lower-dimensional spaces. In this paper we apply the neighborhood components analysis (NCA) [2] method to acoustic modeling in a speech recognizer. NCA learns a projection of acoustic vectors that optimizes a criterion closely related to the classification accuracy of a nearest-neighbor classifier. We introduce regularization into this method, giving further improvements in performance. We describe experiments on a lecture transcription task, comparing projections learned using NCA and HLDA [1]. Regularized NCA gives a 0.7% absolute reduction in WER over HLDA, corresponding to a relative reduction of 1.9%.
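As a rough illustration of the criterion the abstract refers to, the sketch below computes the NCA objective of Goldberger et al. [2]: the expected leave-one-out accuracy of a stochastic nearest-neighbor classifier in the projected space, where each point picks a neighbor with probability given by a softmax over negative squared distances. The Frobenius-norm penalty `lam` is only a placeholder for "regularization"; the paper's actual regularizer is not specified here, so that term is an assumption.

```python
import numpy as np

def nca_objective(A, X, y, lam=0.0):
    """NCA criterion: expected leave-one-out soft-NN accuracy of the
    projection A, minus an (assumed, illustrative) Frobenius penalty.

    A : (d_low, d_high) projection matrix
    X : (n, d_high) acoustic feature vectors
    y : (n,) integer class labels
    """
    Z = X @ A.T                                   # project to low-dim space
    # pairwise squared Euclidean distances between projected points
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)                  # a point never picks itself
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)             # p_ij: prob. i picks j
    same_class = (y[:, None] == y[None, :])
    # sum over i of p_i = sum_{j: y_j = y_i} p_ij, minus the penalty
    return (P * same_class).sum() - lam * (A ** 2).sum()
```

The objective lies between 0 and n (the number of training points); a projection that cleanly separates the classes drives it toward n, which is why maximizing it tracks nearest-neighbor accuracy.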
Cite as: Singh-Miller, N., Collins, M., Hazen, T.J. (2007) Dimensionality reduction for speech recognition using neighborhood components analysis. Proc. Interspeech 2007, 1158-1161, doi: 10.21437/Interspeech.2007-376
@inproceedings{singhmiller07_interspeech,
  author    = {Natasha Singh-Miller and Michael Collins and Timothy J. Hazen},
  title     = {{Dimensionality reduction for speech recognition using neighborhood components analysis}},
  year      = {2007},
  booktitle = {Proc. Interspeech 2007},
  pages     = {1158--1161},
  doi       = {10.21437/Interspeech.2007-376}
}