September 22-25, 1997
This work presents experiments on four segmental training algorithms for mixture density HMMs. The segmental versions of SOM and LVQ3 suggested by the author are compared against the conventional segmental K-means and the segmental GPD. The test bench is a speaker-dependent but vocabulary-independent automatic speech recognition task. The output density function of each state in each model is a mixture of multivariate Gaussian densities. The neural network methods SOM and LVQ are applied to learn the parameters of the density models from the mel-cepstrum features of the training samples. Because the segmentation and the segment classification depend on each other, segmental training improves the segmentation and the model parameters in turn to obtain the best possible result. It suffices to start the training process by dividing the training samples approximately into phoneme samples.
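The alternation between segmentation and parameter updates can be illustrated with a minimal sketch of segmental K-means. This is not the author's implementation, and it simplifies heavily: each state is represented by a single centroid rather than a Gaussian mixture, squared Euclidean distance stands in for the negative log-likelihood, only one utterance is aligned, and the states form a strict left-to-right chain. All function names (uniform_segmentation, update_means, resegment, segmental_kmeans) are hypothetical and chosen for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def uniform_segmentation(n_frames, n_states):
    """Rough initial split: divide the frames evenly among the states."""
    return [t * n_states // n_frames for t in range(n_frames)]


def update_means(frames, labels, n_states):
    """Parameter step: each state's centroid is the mean of its frames."""
    dim = len(frames[0])
    means = []
    for s in range(n_states):
        members = [f for f, lab in zip(frames, labels) if lab == s]
        means.append([sum(f[d] for f in members) / len(members)
                      for d in range(dim)])
    return means


def resegment(frames, means):
    """Segmentation step: Viterbi-style dynamic programming.

    Finds the lowest-cost left-to-right alignment in which every frame
    either stays in the current state or advances to the next one.
    Assumes there are at least as many frames as states.
    """
    n_frames, n_states = len(frames), len(means)
    inf = float("inf")
    cost = [[inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    cost[0][0] = sq_dist(frames[0], means[0])
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = cost[t - 1][s]
            advance = cost[t - 1][s - 1] if s > 0 else inf
            if advance < stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            if best < inf:
                cost[t][s] = best + sq_dist(frames[t], means[s])
    # Trace back from the last state at the last frame.
    labels = [0] * n_frames
    s = n_states - 1
    for t in range(n_frames - 1, -1, -1):
        labels[t] = s
        s = back[t][s]
    return labels


def segmental_kmeans(frames, n_states, max_iters=20):
    """Alternate parameter updates and resegmentation until stable."""
    labels = uniform_segmentation(len(frames), n_states)
    for _ in range(max_iters):
        means = update_means(frames, labels, n_states)
        new_labels = resegment(frames, means)
        if new_labels == labels:
            break
        labels = new_labels
    # Recompute so the returned centroids match the returned labels.
    return labels, update_means(frames, labels, n_states)


if __name__ == "__main__":
    # Toy one-dimensional "features" with three clusters of unequal length.
    frames = [[0.0], [0.1], [0.2], [5.0], [5.1],
              [9.9], [10.0], [10.1], [10.2]]
    labels, means = segmental_kmeans(frames, 3)
    print(labels)
    print(means)
```

In this toy run the uniform initial split places the frame 9.9 in the middle state, and the first resegmentation moves it to the last state, which shows how an approximate starting segmentation can be corrected by the alternation. A mixture density version would replace the per-state centroid with several weighted Gaussian components and the distance with a log-likelihood score.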
Bibliographic reference. Kurimo, Mikko (1997): "Comparison results for segmental training algorithms for mixture density HMMs", In EUROSPEECH-1997, 87-90.