8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Comparison of Subspace Methods for Gaussian Mixture Models in Speech Recognition

Matti Varjokallio, Mikko Kurimo

Helsinki University of Technology, Finland

Speech recognizers typically use high-dimensional feature vectors to capture the essential cues for speech recognition. The acoustics are then commonly modeled with a Hidden Markov Model with Gaussian Mixture Models as observation probability density functions. Using unrestricted Gaussian parameters can lead to intolerable model costs in both evaluation and storage, which limits their practical use to some high-end systems. The classical approach to tackling these problems is to assume independent features and constrain the covariance matrices to be diagonal. This can be thought of as constraining the second-order parameters to lie in a fixed subspace consisting of rank-1 terms. In this paper we discuss the differences between recently proposed subspace methods for GMMs, with emphasis on the applicability of the models to a practical LVCSR system.
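
The remark about diagonal covariances lying in a fixed subspace of rank-1 terms can be illustrated with a small sketch. This is not taken from the paper: the function names and the numeric values are hypothetical, and the sketch assumes the subspace view is applied to the precision (inverse covariance) matrix, which for a diagonal model is simply the sum of outer products e_k e_k^T of the coordinate axes weighted by 1/variance_k. It checks that the Gaussian quadratic form computed from that sum equals the usual per-dimension computation.

```python
# Illustration only: names and values are hypothetical, not from the paper.

def outer(u, v):
    """Outer product u v^T as a nested list (a rank-1 matrix)."""
    return [[ui * vj for vj in v] for ui in u]

def diagonal_precision_from_rank1(variances):
    """Build a diagonal precision matrix as a weighted sum of fixed
    rank-1 basis matrices e_k e_k^T, with weight 1 / variance_k."""
    d = len(variances)
    precision = [[0.0] * d for _ in range(d)]
    for k, var in enumerate(variances):
        e_k = [1.0 if i == k else 0.0 for i in range(d)]
        basis = outer(e_k, e_k)  # fixed, shared by all Gaussians
        for i in range(d):
            for j in range(d):
                precision[i][j] += basis[i][j] / var
    return precision

def quadratic_form(x, mean, precision):
    """(x - mean)^T P (x - mean) with a full matrix: O(d^2) work."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    d = len(diff)
    return sum(diff[i] * precision[i][j] * diff[j]
               for i in range(d) for j in range(d))

def diagonal_quadratic_form(x, mean, variances):
    """Same quantity using only the d subspace weights: O(d) work."""
    return sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mean, variances))

variances = [0.5, 2.0, 4.0]
mean = [0.0, 1.0, -1.0]
x = [1.0, 3.0, 1.0]

precision = diagonal_precision_from_rank1(variances)
full_cost = quadratic_form(x, mean, precision)
diag_cost = diagonal_quadratic_form(x, mean, variances)
print(full_cost, diag_cost)
```

Only the d weights differ between Gaussians in this setup, while the rank-1 basis is fixed, which is why both the storage and the evaluation cost drop from quadratic to linear in the feature dimension.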


Bibliographic reference. Varjokallio, Matti / Kurimo, Mikko (2007): "Comparison of subspace methods for Gaussian mixture models in speech recognition", In INTERSPEECH-2007, 2121-2124.