Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

The Effects of Speaker Training on ASR Accuracy

Stephen Anderson, Natalie Liberman, Larry Gillick, Stephen Foster, Sahoko Hama

Dragon Systems, Newton, MA, USA

Do experienced speech recognition users achieve high accuracy rates because their systems have taught them successful speaking styles? We report an experiment to quantify this "speaker training" effect. Thirty computer-literate elderly speakers (15 male, 15 female) with no previous automatic speech recognition (ASR) experience were given 2 hours of intensive training in using a speech recognition system. Before and after this training session, they read separate 520-word texts. Measuring the word error rates (WERs) on these "before training" and "after training" recordings, we find a small but statistically significant improvement: the average WER was 20.9% before training and 19.8% after. This improvement is surprisingly small, since anecdotal evidence suggests that experienced ASR users have substantially higher accuracy than novices; the effect may be larger with more extensive training. We also examine changes in speaking rate, phrase length, and signal-to-noise ratio (SNR) and their impact on WER.
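For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between the reference text and the recognizer output, divided by the number of reference words. The abstract does not say which scoring tool or text normalization was used, so the function names `word_error_rate` and `relative_reduction` and the whitespace tokenization are illustrative assumptions, not the authors' procedure. The second helper expresses the reported change from 20.9% to 19.8% as a relative reduction, which works out to about 5.3%.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace with no further
    normalization (case, punctuation); real scoring tools usually
    normalize the text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(wer_before: float, wer_after: float) -> float:
    """Fraction by which the WER dropped, relative to the starting WER."""
    return (wer_before - wer_after) / wer_before


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the experiment's texts.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Averages reported in the abstract, in percent.
    print(relative_reduction(20.9, 19.8))
```

Because insertions count as errors, a WER computed this way can exceed 100% when the recognizer output is much longer than the reference.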

