Third International Conference on Spoken Language Processing (ICSLP 94)
In this paper, we study issues related to string-level acoustic modeling in continuous speech recognition. A new approach based on the minimum string error rate criterion is proposed for training inter-word context-dependent acoustic model units. Under the proposed approach, these units are modeled at the global string level by applying minimum string error rate discriminative analysis directly to string-level acoustic model matching. Experimental results indicate that the proposed approach achieves a significant reduction in error rate. The best performance obtained with a gender-independent model on the TI connected digit corpus is a 0.24% word error rate and a 0.72% string error rate.
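The abstract reports two figures, a word error rate and a string error rate. The sketch below illustrates how the two metrics are conventionally computed, assuming that word errors are counted by Levenshtein alignment and that a string counts as wrong if it contains any error. The function names `edit_distance` and `error_rates` and the digit strings are hypothetical; none of this is taken from the paper, and it is not the authors' scoring tool.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (word_error_rate, string_error_rate) for (reference, hypothesis) pairs."""
    word_errors = 0
    ref_words = 0
    string_errors = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        word_errors += edit_distance(r, h)
        ref_words += len(r)
        string_errors += int(r != h)  # any mismatch makes the whole string wrong
    if ref_words == 0:
        raise ValueError("no reference words to score")
    return word_errors / ref_words, string_errors / len(pairs)


# Hypothetical digit strings, for illustration only.
pairs = [("1 2 3", "1 2 3"), ("4 5 6 7", "4 5 7"), ("8 9", "8 1")]
wer, ser = error_rates(pairs)
```

Because a single word error makes the whole string wrong, the string error rate is typically the higher of the two on multi-digit strings, which matches the ordering of the figures reported above.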
Bibliographic reference. Chou, W. / Lee, C.-H. / Juang, Biing-Hwang (1994): "Minimum error rate training of inter-word context dependent acoustic model units in speech recognition", In ICSLP-1994, 439-442.