Experiments investigated the effects of training set size and diversity of speech data in training an HMM-based, speaker-independent, continuous Japanese speech recognition system. Two types of diversity were investigated: speaker diversity and phonetic diversity. The results indicate that greater amounts of training data improve recognition performance and that, given a fixed amount of training data, greater diversity of training materials, in terms of both speakers and phonetic contexts, improves recognition performance.
Cite as: Shirotsuka, O., Kawai, G., Cohen, M., Bernstein, J. (1992) Performance of speaker-independent Japanese recognizer as a function of training set size and diversity. Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992), 297-300, doi: 10.21437/ICSLP.1992-65
@inproceedings{shirotsuka92_icslp,
  author={O. Shirotsuka and G. Kawai and Michael Cohen and J. Bernstein},
  title={{Performance of speaker-independent Japanese recognizer as a function of training set size and diversity}},
  year=1992,
  booktitle={Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992)},
  pages={297--300},
  doi={10.21437/ICSLP.1992-65}
}