5th International Conference on Spoken Language Processing
Many HMM-based recognition systems use mixtures of diagonal covariance Gaussians to model the observation density functions in the states. These mixtures are, however, only approximations of the real distributions. One of the approximations is the assumption that the off-diagonal elements of the covariance matrices of the Gaussians are close to zero (diagonal covariance). To make this assumption hold, most recognition systems apply some kind of parameter decorrelation near the end of the preprocessing, e.g. the inverse cosine transform used with cepstral transformations. These transforms are, however, not optimal when it comes to decorrelating features on the Gaussian level. This paper presents a solution to the decorrelation problem that is optimal in a least-squares sense. It also demonstrates the link between the recently published maximum likelihood modelling for semi-tied covariance matrices and the presented least-squares optimisation. Evaluation on a large vocabulary recognition task shows a 10% relative improvement.
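The claim that a single fixed transform does not fully decorrelate features on the Gaussian level can be illustrated with a small numerical sketch. It is not the least-squares method of the paper, whose details are not given in this abstract. It uses a plain eigenvector rotation as a stand-in, and the data, the dimensions and the helper `offdiag_ratio` are hypothetical choices made only for illustration.

```python
import numpy as np

def offdiag_ratio(cov):
    """Fraction of the squared covariance mass that lies off the diagonal."""
    off = cov - np.diag(np.diag(cov))
    return float(np.sum(off ** 2) / np.sum(cov ** 2))

rng = np.random.default_rng(0)
dim, n = 4, 5000

# Two synthetic "Gaussians" with different full covariances (assumed data).
class_a = rng.normal(size=(n, dim)) @ rng.normal(size=(dim, dim))
class_b = rng.normal(size=(n, dim)) @ rng.normal(size=(dim, dim))

# One global rotation, estimated from the pooled data of both Gaussians.
pooled_cov = np.cov(np.vstack([class_a, class_b]), rowvar=False)
_, global_rot = np.linalg.eigh(pooled_cov)

# A rotation estimated from Gaussian A alone.
cov_a = np.cov(class_a, rowvar=False)
_, own_rot = np.linalg.eigh(cov_a)

ratio_raw = offdiag_ratio(cov_a)
ratio_global = offdiag_ratio(global_rot.T @ cov_a @ global_rot)
ratio_own = offdiag_ratio(own_rot.T @ cov_a @ own_rot)

print(f"no transform:        {ratio_raw:.3e}")
print(f"global rotation:     {ratio_global:.3e}")
print(f"Gaussian A rotation: {ratio_own:.3e}")
```

In this sketch the global rotation leaves residual off-diagonal covariance in Gaussian A, while the rotation estimated from A itself removes it up to rounding error. A diagonal covariance model would therefore fit A less well after the global transform, which is the mismatch the abstract refers to.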
Bibliographic reference. Demuynck, Kris / Duchateau, Jacques / Van Compernolle, Dirk / Wambacq, Patrick (1998): "Improved feature decorrelation for HMM-based speech recognition", in ICSLP-1998, paper 1081.