Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Minimum Error Rate Training of Inter-Word Context Dependent Acoustic Model Units in Speech Recognition

W. Chou, C.-H. Lee, Biing-Hwang Juang

AT&T Bell Laboratories, Murray Hill, NJ, USA

In this paper, we study issues related to string level acoustic modeling in continuous speech recognition. We propose a new approach, based on the minimum string error rate criterion, for training inter-word context dependent acoustic model units. Under this approach, the inter-word context dependent acoustic model units are modeled at the global string level by applying minimum string error rate based discriminative analysis directly to string level acoustic model matching. Experimental results indicate that the approach achieves a significant error rate reduction. The best performance it obtains with a gender-independent model on the TI connected digit corpus is 0.24% word error rate and 0.72% string error rate.

Bibliographic reference. Chou, W. / Lee, C.-H. / Juang, Biing-Hwang (1994): "Minimum error rate training of inter-word context dependent acoustic model units in speech recognition", in ICSLP-1994, 439-442.