First International Conference on Spoken Language Processing (ICSLP 90)
This paper describes an extension of Elman's recurrent neural network, adapted for speech recognition with input context buffers and an analog target function. The input layer uses context buffers to extract context-sensitive features from the input. The analog target function in the output layer reflects the confidence level of the output for the current input in the context buffer. Speaker-dependent recognition results for 10 syllables using cepstral coefficients show that the extended Elman network outperforms both the original Elman network and a multi-layer perceptron. The recognition accuracy of the extended Elman network is better than that of the cepstral distance measure and comparable to that of the weighted cepstral distance measure using dynamic time warping based template matching. The preliminary conclusion is that the input context buffers with time-replicated scanning enhance the shift-invariant capability of the recurrent neural network.
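To make the architecture concrete, the following is a minimal sketch of an Elman-style forward pass whose input is a sliding context buffer of cepstral frames. It is an illustration under stated assumptions, not the authors' implementation: the class name `ElmanWithContextBuffer`, the helper names, the buffer width, the layer sizes, the sigmoid units, and the random weight initialization are all hypothetical, and training against the analog target function is not shown because the abstract does not define it.

```python
import math
import random


def make_context_windows(frames, width):
    """Concatenate `width` consecutive frames into one input vector per step."""
    windows = []
    for t in range(len(frames) - width + 1):
        window = []
        for frame in frames[t:t + width]:
            window.extend(frame)
        windows.append(window)
    return windows


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; the scale is an arbitrary assumption."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ElmanWithContextBuffer:
    """Hypothetical sketch: Elman recurrence fed by an input context buffer."""

    def __init__(self, n_coeffs, width, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.n_hidden = n_hidden
        self.w_in = init_matrix(n_hidden, n_coeffs * width, rng)
        self.w_rec = init_matrix(n_hidden, n_hidden, rng)
        self.w_out = init_matrix(n_classes, n_hidden, rng)

    def forward(self, frames):
        # Elman context units hold a copy of the previous hidden state.
        hidden = [0.0] * self.n_hidden
        outputs = []
        # The buffer scans the utterance one frame at a time.
        for window in make_context_windows(frames, self.width):
            from_input = matvec(self.w_in, window)
            from_context = matvec(self.w_rec, hidden)
            hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
            # One output in (0, 1) per class; such values could be compared
            # with an analog (graded) target during training.
            outputs.append([sigmoid(z) for z in matvec(self.w_out, hidden)])
        return outputs
```

With, say, 20 frames of 12 cepstral coefficients and a buffer width of 3, `forward` returns one output vector for each of the 18 buffer positions, each with one value per syllable class.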
Bibliographic reference. Cho, Yong Duk / Kim, Ki Chul / Yoon, Hyun Soo / Maeng, Seung Ryoul / Cho, Jung Wan (1990): "Extended Elman's recurrent neural network for syllable recognition", in ICSLP-1990, 1057-1060.