First International Conference on Spoken Language Processing (ICSLP 90)
The Neural Prediction Model (NPM) is a speech recognition model designed to account for the correlation between spectral and temporal structures in speech patterns, which are important for accurate speech recognition. This paper presents an improvement to the model and its application to large-vocabulary speech recognition based on subword units. The improvement introduces "backward prediction," which raises the prediction accuracy beyond that of the original model, which used only "forward prediction." To apply the model to large-vocabulary speech recognition, the demi-syllable is used as the subword recognition unit. Speaker-dependent large-vocabulary speech recognition experiments were carried out to examine the amount of training data needed for model parameter estimation and the input layer configuration of the pattern predictors. The best result was a recognition accuracy of 94.8% on a 5000-word test set, which confirmed the effectiveness of both the proposed model improvement and the demi-syllable units.
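The idea of combining forward and backward prediction can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "forward prediction" means predicting a frame from the frames preceding it and that "backward prediction" adds the frames following it as predictor input; the abstract does not spell out these details. For brevity it substitutes a linear least-squares predictor for the neural network pattern predictors and uses synthetic random-walk "frames" instead of speech features. All names (`stack_context`, `prediction_error`, `margin`) are hypothetical.

```python
import numpy as np

def stack_context(frames, n_past, n_future, margin):
    """Pair each target frame with its context frames.

    Every frame t in [margin, T - margin) becomes a target; its input is
    the concatenation of n_past preceding frames, n_future following
    frames, and a constant bias term. `margin` fixes the set of target
    frames so that different context sizes are compared on the same data.
    """
    assert margin >= max(n_past, n_future)
    inputs, targets = [], []
    for t in range(margin, len(frames) - margin):
        past = frames[t - n_past:t].ravel()            # forward-prediction input
        future = frames[t + 1:t + 1 + n_future].ravel()  # backward-prediction input
        inputs.append(np.concatenate([past, future, [1.0]]))
        targets.append(frames[t])
    return np.array(inputs), np.array(targets)

def prediction_error(frames, n_past, n_future, margin):
    """Sum of squared residuals of a least-squares linear predictor.

    Assumption: a linear predictor stands in for the neural predictor.
    """
    X, Y = stack_context(frames, n_past, n_future, margin)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return float(np.sum((Y - X @ W) ** 2))

# Synthetic stand-in for a feature sequence: 200 frames, 4 dimensions.
rng = np.random.default_rng(0)
frames = np.cumsum(rng.normal(size=(200, 4)), axis=0)

err_forward = prediction_error(frames, n_past=2, n_future=0, margin=2)
err_both = prediction_error(frames, n_past=2, n_future=1, margin=2)
print(f"forward only:       {err_forward:.2f}")
print(f"forward + backward: {err_both:.2f}")
```

With a least-squares predictor fitted on the same target frames, adding the following frame as input cannot increase the fitted error, because the forward-only inputs are a subset of the combined inputs. Whether the same holds on held-out speech with neural predictors is an empirical question, which is what the reported experiments address.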
Bibliographic reference. Iso, Ken-ichi / Watanabe, Takao (1990): "Speech recognition using demi-syllable neural prediction model", In ICSLP-1990, 661-664.