Sixth International Conference on Spoken Language Processing (ICSLP 2000)
Beijing, China
October 16-20, 2000
Semi-Continuous Segmental Probability Modeling for Continuous Speech Recognition
Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang
Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems,
Department of Computer Science & Technology, Tsinghua University, Beijing, China
This paper presents the design of semi-continuous segmental
probability models (SCSPMs) for large-vocabulary continuous
speech recognition. The tied Gaussian densities are trained on
data from all states of all utterances, while the mixture
weights of each state are estimated only from the data of that
state. The SCSPMs tie the densities of all states of all Speech
Recognition Units (SRUs) into a shared pdf codebook, which
greatly reduces the number of Gaussian densities. Several
pruning methods are reviewed, and a new pruning criterion is
proposed to reduce the number of tied mixture Gaussian
densities, exploiting the observation that only a small subset
of them have large tying weights. Our preliminary experiments
show that the SCSPM combined with the pruning techniques
reduces model storage and speeds up the system, with little
degradation in accuracy compared with the prior continuous
model.
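
The tying scheme described in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians, toy parameter values, and a plain top-K rule for pruning the weights. The top-K rule is only a stand-in, since the abstract does not spell out the proposed criterion, and all names (`codebook`, `state_weights`, `prune_weights`, `state_log_likelihood`) are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def prune_weights(weights, keep):
    """Keep the `keep` largest weights and renormalize them to sum to 1.

    Returns {codebook_index: weight}. Top-K is an illustrative rule,
    not the pruning criterion proposed in the paper.
    """
    top = sorted(range(len(weights)), key=lambda k: weights[k], reverse=True)[:keep]
    total = sum(weights[k] for k in top)
    return {k: weights[k] / total for k in top}

def state_log_likelihood(x, codebook, sparse_weights):
    """log sum_k c_k N(x; mu_k, var_k) over the densities kept for one state."""
    terms = [
        math.log(c) + log_gauss_diag(x, *codebook[k])
        for k, c in sparse_weights.items()
    ]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Shared codebook of (mean, variance) pairs, used by every state (toy values).
codebook = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([3.0, 3.0], [1.0, 1.0]),
    ([-3.0, 0.0], [1.0, 1.0]),
    ([0.0, 5.0], [2.0, 2.0]),
]

# State-specific mixture weights over the shared codebook (toy values).
state_weights = {
    "s1": [0.70, 0.25, 0.04, 0.01],
    "s2": [0.05, 0.10, 0.80, 0.05],
}

# After pruning, each state stores only a few (index, weight) pairs.
pruned = {s: prune_weights(w, keep=2) for s, w in state_weights.items()}

frame = [0.0, 0.0]
scores = {s: state_log_likelihood(frame, codebook, w) for s, w in pruned.items()}
```

In this sketch each state stores only the indices and weights of the densities it keeps, which is where the storage saving would come from, and scoring a frame evaluates only those densities instead of the whole codebook.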
Bibliographic reference.
Zhang, Jiyong / Zheng, Fang / Xu, Mingxing / Fang, Ditang (2000):
"Semi-continuous segmental probability modeling for continuous speech recognition",
In ICSLP-2000, vol.1, 278-281.