Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Semi-Continuous Segmental Probability Modeling for Continuous Speech Recognition

Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang

Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science & Technology, Tsinghua University, Beijing, China

This paper presents the design of semi-continuous segmental probability models (SCSPMs) for large-vocabulary continuous speech recognition. The tied Gaussian densities are trained using data from all states of all utterances, whereas the mixture weights of each state are estimated using only the data belonging to that state. The SCSPMs tie the densities of all states of all Speech Recognition Units (SRUs) into a single shared pdf codebook, which greatly reduces the number of Gaussian densities. Several pruning methods are reviewed, and a new pruning criterion is then proposed to reduce the number of tied mixture Gaussian densities further, since only a small subset of the mixture Gaussian densities have large tying weights. Our preliminary experiments show that the SCSPM combined with the pruning techniques reduces model storage and speeds up the system, with little degradation in accuracy compared to the previous continuous model.
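To make the tying idea concrete, the following is a minimal sketch of a generic semi-continuous output probability, not the authors' implementation. Every state j shares one codebook of K Gaussians and keeps only its own weight vector, so its score is b_j(x) = sum_k c_jk N(x; mu_k, Sigma_k). The abstract does not specify the proposed pruning criterion, so a simple "keep the largest weights" rule stands in for it here. The diagonal covariances, the function names, and the `keep` parameter are all assumptions made for illustration.

```python
import numpy as np

def log_gaussian_diag(x, means, variances):
    """Log density of x under each diagonal-covariance Gaussian in the codebook."""
    d = x.shape[0]
    diff = x - means                                   # shape (K, D)
    return -0.5 * (d * np.log(2.0 * np.pi)
                   + np.sum(np.log(variances), axis=1)
                   + np.sum(diff * diff / variances, axis=1))

def prune_weights(weights, keep):
    """Illustrative pruning (an assumption, not the paper's criterion):
    keep the `keep` largest weights of each state, zero the rest, renormalize."""
    pruned = np.zeros_like(weights)
    top = np.argsort(weights, axis=1)[:, -keep:]       # indices of largest weights
    rows = np.arange(weights.shape[0])[:, None]
    pruned[rows, top] = weights[rows, top]
    return pruned / pruned.sum(axis=1, keepdims=True)

def state_log_likelihoods(x, means, variances, weights):
    """log b_j(x) = log sum_k c_jk N(x; mu_k, Sigma_k) for every state j."""
    log_dens = log_gaussian_diag(x, means, variances)  # shape (K,), shared by all states
    with np.errstate(divide="ignore"):
        log_w = np.log(weights)                        # -inf where a weight was pruned
    scores = log_w + log_dens                          # shape (J, K)
    m = scores.max(axis=1, keepdims=True)              # log-sum-exp for stability
    return (m + np.log(np.exp(scores - m).sum(axis=1, keepdims=True))).ravel()

# Toy setup: 8 shared Gaussians, 3-dimensional features, 4 states.
rng = np.random.default_rng(0)
K, D, J = 8, 3, 4
means = rng.normal(size=(K, D))
variances = np.ones((K, D))
weights = rng.dirichlet(np.ones(K), size=J)            # one weight vector per state

pruned = prune_weights(weights, keep=3)
x = rng.normal(size=D)
full = state_log_likelihoods(x, means, variances, weights)
approx = state_log_likelihoods(x, means, variances, pruned)
```

In this sketch the Gaussian log densities are computed once per frame and reused by every state, which is where a shared codebook saves computation. After pruning, each state stores only `keep` nonzero weights instead of K.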



Bibliographic reference. Zhang, Jiyong / Zheng, Fang / Xu, Mingxing / Fang, Ditang (2000): "Semi-continuous segmental probability modeling for continuous speech recognition", in ICSLP-2000, vol. 1, 278-281.