This paper presents the design of semi-continuous segmental probability models (SCSPMs) for large vocabulary continuous speech recognition. The tied Gaussian densities are trained using data from all states of all utterances, while the mixture weights are estimated individually for each state using only that state's data. The SCSPMs tie all the densities of all states from all Speech Recognition Units (SRUs) into a shared pdf codebook, greatly reducing the number of Gaussian densities. Several pruning methods are reviewed, and a new pruning criterion is then proposed to further reduce the number of tied mixture Gaussian densities evaluated, since only a small subset of them carries larger tying weights. Our preliminary experiments show that the SCSPM combined with the pruning techniques can reduce model storage and speed up the system with little degradation in accuracy compared with the previous continuous model.
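As a rough illustration of the semi-continuous evaluation described in the abstract (a minimal sketch, not the authors' implementation), the Python snippet below scores a feature frame against a shared Gaussian codebook, mixes the densities with per-state weights, and applies a weight-floor pruning rule so that only the small subset of densities with larger tying weights is evaluated. All function names, shapes, and the pruning threshold are hypothetical.

```python
import numpy as np

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x (1-D arrays of equal length)."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def state_log_likelihood(x, codebook_means, codebook_vars, state_weights, weight_floor=1e-3):
    """Semi-continuous state score: mix shared-codebook densities with per-state weights.

    Illustrative pruning: densities whose tying weight falls below `weight_floor`
    are skipped, so only the dominant mixture components are computed.
    Assumes at least one weight survives the floor.
    """
    log_terms = []
    for m, w in enumerate(state_weights):
        if w < weight_floor:  # hypothetical weight-based pruning criterion
            continue
        log_terms.append(np.log(w) + log_gaussian(x, codebook_means[m], codebook_vars[m]))
    # Log-sum-exp over the surviving mixture components.
    log_terms = np.array(log_terms)
    top = log_terms.max()
    return top + np.log(np.exp(log_terms - top).sum())
```

Because the codebook (`codebook_means`, `codebook_vars`) is shared across all states and SRUs, only the per-state weight vectors differ, which is what yields the storage and speed savings the abstract refers to.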
Cite as: Zhang, J., Zheng, F., Xu, M., Fang, D. (2000) Semi-continuous segmental probability modeling for continuous speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 278-281, doi: 10.21437/ICSLP.2000-69
@inproceedings{zhang00c_icslp,
  author    = {Jiyong Zhang and Fang Zheng and Mingxing Xu and Ditang Fang},
  title     = {{Semi-continuous segmental probability modeling for continuous speech recognition}},
  year      = {2000},
  booktitle = {Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages     = {vol. 1, 278-281},
  doi       = {10.21437/ICSLP.2000-69}
}