5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Subspace Distribution Clustering for Continuous Observation Density Hidden Markov Models

Enrico Bocchieri (1), Brian Mak (2)

(1) AT&T Labs-Research, Florham Park, NJ 07932, USA
(2) Oregon Graduate Institute, 20000 NW Walker Rd, Portland, OR, USA

This paper presents an efficient approximation of the Gaussian mixture state probability density functions of continuous observation density hidden Markov models (CHMMs). In CHMMs, evaluating the Gaussian mixtures carries a high computational cost, which amounts to a significant fraction (e.g. 30% to 70%) of the total computation. To achieve higher computational and memory efficiency, we approximate the Gaussian mixtures by (a) decomposing them into functions defined on subspaces of the feature space, and (b) clustering the resulting subspace pdfs. Intuitively, when clustering in a subspace of few dimensions, even a small number of function codewords can yield a small distortion. We therefore obtain a significant reduction of the total computation (up to a factor of two) and memory savings (up to a factor of twelve), without significant changes in the CHMMs' accuracy.
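The following is a minimal sketch of the evaluation side of this idea, not the authors' implementation. It assumes diagonal-covariance Gaussians, a hypothetical split of a 4-dimensional feature vector into two 2-dimensional streams, and small hand-written codebooks standing in for the result of the clustering step (which is not shown). All names (`streams`, `codebooks`, `codeword_scores`, `tied_log_gauss`) are illustrative. Each subspace codeword is scored once per frame, and every full-space Gaussian is then obtained by summing table lookups.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def project(vec, dims):
    """Keep only the coordinates of vec that belong to one subspace."""
    return [vec[d] for d in dims]

def codeword_scores(x, streams, codebooks):
    """Score every subspace codeword once for this frame: scores[k][c]."""
    return [
        [log_gauss_diag(project(x, dims), m, v) for (m, v) in codebooks[k]]
        for k, dims in enumerate(streams)
    ]

def tied_log_gauss(scores, codeword_ids):
    """Log density of one full-space Gaussian as a sum of table lookups."""
    return sum(scores[k][c] for k, c in enumerate(codeword_ids))

# Hypothetical toy setup: 4-dim features split into two 2-dim streams.
streams = [[0, 1], [2, 3]]

# Assumed output of the clustering step: (mean, variance) codewords per stream.
codebooks = [
    [([0.0, 0.0], [1.0, 1.0]), ([1.0, -1.0], [0.5, 2.0])],
    [([0.5, 0.5], [1.0, 1.0]), ([-1.0, 2.0], [2.0, 0.5])],
]

# Each Gaussian stores only one codeword index per stream.
gaussians = [(0, 0), (0, 1), (1, 1)]

x = [0.2, -0.3, 0.7, 1.5]  # one observation frame
scores = codeword_scores(x, streams, codebooks)
tied = [tied_log_gauss(scores, ids) for ids in gaussians]
```

In this sketch the three Gaussians share four codewords, so a frame costs four subspace evaluations plus a few additions, and each Gaussian is stored as two small integers rather than a full mean and variance vector. Mixture weights and the choice of clustering distortion measure are left out.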


Bibliographic reference.  Bocchieri, Enrico / Mak, Brian (1997): "Subspace distribution clustering for continuous observation density hidden Markov models", In EUROSPEECH-1997, 107-110.