7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper explores the problem of predicting specific reading mistakes, called miscues, on a given word. Characterizing likely miscues tells an automated reading tutor what to anticipate, detect, and remediate. As training and test data, we use a database of over 100,000 miscues transcribed by University of Colorado researchers. We explore approaches that exploit different sources of predictive power: the uneven distribution of words in text, and the fact that most miscues are real words. We compare the approaches’ ability to predict miscues of other readers on other text. A simple rote method does best on the most frequent 100 words of English, while an extrapolative method for predicting real-word miscues performs well on less frequent words, including words not in the training data.
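As an illustration only, the sketch below shows one plausible reading of a "rote" predictor: for each target word, remember the miscues seen in training and predict the most frequent ones. The abstract does not specify the method in this detail, so the function names (train_rote, predict_rote), the (target, miscue) pair format, and the toy data are all assumptions rather than the authors' implementation.

```python
from collections import Counter, defaultdict


def train_rote(observations):
    """Count miscues per target word.

    `observations` is an iterable of (target_word, miscue) pairs.
    This pair format is a hypothetical one chosen for the sketch.
    """
    table = defaultdict(Counter)
    for target, miscue in observations:
        table[target.lower()][miscue.lower()] += 1
    return table


def predict_rote(table, target, k=3):
    """Return up to k miscues most often observed for `target`.

    Words never seen in training yield an empty list: a purely rote
    lookup has nothing to say about them.
    """
    counts = table.get(target.lower())
    if not counts:
        return []
    return [miscue for miscue, _ in counts.most_common(k)]


# Invented toy data, not drawn from the miscue database.
toy = [("the", "a"), ("the", "a"), ("the", "they"), ("was", "saw")]
table = train_rote(toy)
top_for_the = predict_rote(table, "the", k=1)
unseen = predict_rote(table, "elephant")
```

A lookup of this kind can only cover words that occur in the training data, which is consistent with the rote method doing best on the most frequent words and with the need for an extrapolative method on words not in the training data.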
Bibliographic reference. Mostow, Jack / Beck, Joseph / Winter, S. Vanessa / Wang, Shaojun / Tobin, Brian (2002): "Predicting oral reading miscues", In ICSLP-2002, 1221-1224.