This paper describes a set of experiments on neural-network training and search techniques that, when combined, resulted in a 54% reduction in error on the continuous digit recognition task. The best system achieved a word-level accuracy of 97.52% on a test set from the OGI 30K Numbers corpus, which contains naturally produced continuous digit strings recorded over telephone channels. The experiments investigated the effects of the feature set, the amount of training data, the type of context-dependent categories to be recognized, the values of the duration limits, and the type of grammar. The results indicate that the grammar and the duration limits had a greater effect on recognition accuracy than the output categories, the cepstral features, or a 50% increase in the amount of training data.
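To make the relationship between the two headline figures concrete, the following is a minimal arithmetic sketch. It assumes the 54% figure is a relative reduction in word-level error measured on the same test set, which the abstract does not state explicitly. The function names are hypothetical, and the implied baseline is a back-calculation under that assumption, not a number reported in the paper.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Relative reduction in error rate; accuracies are given in percent."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_error(new_acc, reduction):
    """Error rate (percent) a baseline would need for `reduction` to hold.

    Assumption: `reduction` is relative, i.e. new_err = baseline_err * (1 - reduction).
    """
    return (100.0 - new_acc) / (1.0 - reduction)


# Figures from the abstract: 97.52% word-level accuracy, 54% error reduction.
best_error = 100.0 - 97.52
baseline_error = implied_baseline_error(97.52, 0.54)

print(f"best-system word error: {best_error:.2f}%")
print(f"implied baseline word error (assumed relative reduction): {baseline_error:.2f}%")
```

Under this reading, a 2.48% word error for the best system would correspond to a starting error of roughly 5.4%. If the 54% figure was instead computed at the sentence or string level, the calculation does not apply.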
Cite as: Hosom, J.-P., Cole, R.A., Cosi, P. (1998) Evaluation and integration of neural-network training techniques for continuous digit recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0613, doi: 10.21437/ICSLP.1998-401
@inproceedings{hosom98_icslp,
  author={John-Paul Hosom and Ronald A. Cole and Piero Cosi},
  title={{Evaluation and integration of neural-network training techniques for continuous digit recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0613},
  doi={10.21437/ICSLP.1998-401}
}