8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Model-Integration Rapid Training Based on Maximum Likelihood for Speech Recognition

Shinichi Yoshizawa (1), Kiyohiro Shikano (2)

(1) Matsushita Electric Industrial Co. Ltd., Japan
(2) Nara Institute of Science and Technology, Japan

Speech recognition technology is now widely used. Because training an acoustic model is costly, it is beneficial to reuse pre-existing acoustic models to build one suited to a given device or application. However, a complex acoustic model designed for a high-performance CPU does not run well on a low-performance one, and a simple model designed for applications that demand fast processing does not work well for applications that demand high precision. It is therefore important to adjust model complexity, such as the number of Gaussian mixture components, to the device or application. This paper describes a new model-integration type of training for obtaining a required number of Gaussian mixture components. The training can change the number of mixture components to the number required by the specification of the device or application. We propose model-integration rapid training based on maximum likelihood and evaluate its recognition performance, with successful results.
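
The abstract does not spell out the training procedure, so the following is only an illustrative sketch of the general idea of reducing a Gaussian mixture to a required number of components. It is not the authors' algorithm. It assumes diagonal covariances, merges two components by moment matching (which gives the maximum-likelihood single Gaussian for the pair), and picks the pair to merge with a simple weighted mean-distance heuristic. The names `Gaussian`, `merge_pair`, `merge_cost`, and `reduce_mixture` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass
class Gaussian:
    weight: float       # mixture weight
    mean: List[float]   # mean vector
    var: List[float]    # diagonal covariance (per-dimension variance)


def merge_pair(a: Gaussian, b: Gaussian) -> Gaussian:
    """Moment-matched merge: the single Gaussian that preserves the
    pair's total weight, mean, and per-dimension variance."""
    w = a.weight + b.weight
    mean = [(a.weight * ma + b.weight * mb) / w
            for ma, mb in zip(a.mean, b.mean)]
    var = [(a.weight * (va + (ma - m) ** 2)
            + b.weight * (vb + (mb - m) ** 2)) / w
           for ma, mb, va, vb, m in zip(a.mean, b.mean, a.var, b.var, mean)]
    return Gaussian(w, mean, var)


def merge_cost(a: Gaussian, b: Gaussian) -> float:
    """Assumed heuristic: weighted squared distance between means,
    i.e. the variance increase caused by merging the pair."""
    dist2 = sum((ma - mb) ** 2 for ma, mb in zip(a.mean, b.mean))
    return a.weight * b.weight / (a.weight + b.weight) * dist2


def reduce_mixture(components: List[Gaussian], target: int) -> List[Gaussian]:
    """Greedily merge the cheapest pair until `target` components remain."""
    if target < 1:
        raise ValueError("target must be at least 1")
    comps = list(components)
    while len(comps) > target:
        i, j = min(combinations(range(len(comps)), 2),
                   key=lambda ij: merge_cost(comps[ij[0]], comps[ij[1]]))
        merged = merge_pair(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)]
        comps.append(merged)
    return comps
```

For example, calling `reduce_mixture(mixture, 2)` on a three-component mixture whose last two means are close together merges those two and leaves the distant component untouched. A practical system would repeat this per HMM state and would likely refine the merged parameters with further maximum-likelihood re-estimation.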

Bibliographic reference. Yoshizawa, Shinichi / Shikano, Kiyohiro (2003): "Model-integration rapid training based on maximum likelihood for speech recognition", in EUROSPEECH-2003, 2621-2624.