5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Training of Context-Dependent Subspace Distribution Clustering Hidden Markov Model

Brian Mak (1), Enrico Bocchieri (2)

(1) Department of Computer Science, The Hong Kong University of Science & Technology, China
(2) AT&T Labs -- Research, USA

Training of continuous density hidden Markov models (CDHMMs) is usually time-consuming and tedious because of the large number of model parameters involved. Recently we proposed a new derivative of the CDHMM, the subspace distribution clustering hidden Markov model (SDCHMM), which ties CDHMMs at the finer level of subspace distributions, resulting in many fewer model parameters. We also devised an SDCHMM training algorithm that trains SDCHMMs directly from speech data without intermediate CDHMMs. On the ATIS task, speaker-independent context-independent (CI) SDCHMMs can be trained with as little as 8 minutes of speech with no loss in recognition accuracy: a 25-fold reduction in training data compared with their CDHMM counterparts. In this paper, we extend our novel SDCHMM training to context-dependent (CD) modeling under various assumptions of prior knowledge. Despite the 30-fold increase in the number of model parameters in the CD ATIS CDHMMs, their equivalent CD SDCHMMs can still be estimated from a few minutes of ATIS data.
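To make the tying idea concrete, below is a minimal Python sketch of subspace distribution clustering under simplifying assumptions: diagonal-covariance Gaussians are split into one-dimensional streams, and each stream's Gaussians are grouped with a k-means-style procedure using a Euclidean distance on (mean, variance) pairs. All names and data here are hypothetical, and the distance metric is a stand-in; the paper's actual algorithm trains SDCHMMs directly from speech and compares distributions rather than raw parameter vectors.

    import numpy as np

    def cluster_subspace(means, variances, n_codewords, n_iters=20, seed=0):
        """k-means-style clustering of 1-D Gaussians by their (mean, variance) pairs."""
        params = np.stack([means, variances], axis=1)          # shape (N, 2)
        rng = np.random.default_rng(seed)
        codebook = params[rng.choice(len(params), n_codewords, replace=False)].copy()
        for _ in range(n_iters):
            # Assign each Gaussian to the nearest codeword. Euclidean distance
            # on (mean, variance) is used here only for simplicity; a proper
            # implementation would measure similarity between distributions.
            dist = ((params[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
            labels = dist.argmin(axis=1)
            for k in range(n_codewords):
                members = params[labels == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
        return codebook, labels

    # Toy usage: 1000 diagonal Gaussians over 39-dim features, treated as 39
    # one-dimensional subspaces; each subspace gets its own 64-entry codebook.
    rng = np.random.default_rng(1)
    N, D, C = 1000, 39, 64
    mu = rng.normal(size=(N, D))
    var = rng.uniform(0.5, 2.0, size=(N, D))
    tied = [cluster_subspace(mu[:, d], var[:, d], C)[1] for d in range(D)]
    # Each Gaussian is now represented by D codeword indices rather than
    # 2*D floating-point parameters, which is the source of the savings.

Because the tying happens per subspace rather than per full-space distribution, many states can share codewords in one stream while differing in another, which is why far less training data is needed to estimate the shared parameters reliably.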


Bibliographic reference. Mak, Brian / Bocchieri, Enrico (1998): "Training of context-dependent subspace distribution clustering hidden Markov model", in ICSLP-1998, paper 0699.