5th International Conference on Spoken Language Processing
A neural net equation developed for stereo vision is applied to speech recognition. We use the Coupled Pattern Recognition (CPR) equation, which has been shown to organize depth perception very well through competition and cooperation. We construct a Gaussian probability density function (pdf) for each phoneme from a set of training data. The input data to be recognized are compared to the pdfs, and a similarity measure is obtained for each phoneme. The CPR equation develops neuron activities by receiving the similarity measures as input. Recognition is achieved when the activities reach a stable state. The average recognition rate for 25 Japanese phonemes is 74.75%, compared with 71.53% for a Hidden Markov Model. We further refine our neuron model by dividing the data of each phoneme into two parts, one for the earlier frames and one for the later frames. This yields a remarkable improvement, with an average recognition rate of 79.79%.
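The similarity step described above can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians, toy two-dimensional feature vectors, and log-density as the similarity measure; none of these details are stated in the abstract. The function names and phoneme labels are hypothetical. The sketch covers only the pdf comparison and does not implement the CPR equation, whose form is not given here.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from training feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [
        # A small floor keeps the variance strictly positive (an assumption).
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return mean, var

def log_density(x, mean, var):
    """Log of a diagonal-covariance Gaussian pdf evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def similarity_scores(x, models):
    """Map each phoneme label to the log-density of x under its Gaussian."""
    return {label: log_density(x, mean, var) for label, (mean, var) in models.items()}

# Toy training data for two hypothetical phoneme labels.
training = {
    "a": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "i": [[4.0, 0.5], [4.2, 0.7], [3.8, 0.3]],
}
models = {label: fit_diag_gaussian(frames) for label, frames in training.items()}

# Score one input vector against every phoneme model.
scores = similarity_scores([1.1, 2.0], models)
best = max(scores, key=scores.get)
```

In the model described in the abstract, scores of this kind are not used for a direct arg-max decision. They serve as the input to the CPR equation, and the recognized phoneme is read off once the neuron activities have stabilized.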
Bibliographic reference. Kitazoe, Tetsuro / Ichiki, Tomoyuki / Kim, Sung-Ill (1998): "Acoustic speech recognition model by neural net equation with competition and cooperation", In ICSLP-1998, paper 0965.