Children's reading skills can be improved with the help of automatic reading tutors (ARTs), i.e. interactive software with an appealing interface that supports and challenges the child in the reading task, provides instantaneous feedback and automatically assesses the child's reading skills. For this purpose, ARTs rely on automatic speech recognition technology to track the child's responses and detect reading miscues (errors). In previous work, a novel speech recognition architecture was proposed which adopts a two-layered structure: first, a phone recognizer uses task-independent acoustic and language models to generate a phone lattice, which is then decoded using a lexicon of expected words and task-dependent finite state grammars. This approach has shown significant improvements in reading miscue detection. In this paper, we extend this technique by employing a more flexible decoding scheme that allows substitution, deletion and insertion of phones. Specifically, the phone lattice generated in the first layer is extended based on a phone confusion matrix that models the typical phone confusions in a language. The proposed system improves miscue detection on the CHOREC database compared to a baseline system without a phone confusion model.
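To make the lattice extension step concrete, the following is a minimal illustrative sketch, not the decoder used in the paper. It assumes a lattice stored as a flat list of arcs and a hypothetical confusion model given as plain dictionaries of probabilities; the names `Arc`, `extend_lattice`, `skip_prob` and `insert_probs` are invented for this example. Substitutions are modelled as parallel arcs with alternative phone labels, deletions as parallel epsilon arcs, and insertions as self-loops on lattice nodes.

```python
import math
from typing import Dict, List, NamedTuple


class Arc(NamedTuple):
    start: int    # source node of the arc
    end: int      # target node of the arc
    phone: str    # phone label; "" marks an epsilon (skip) arc
    logp: float   # log score attached to the arc


def extend_lattice(
    arcs: List[Arc],
    confusion: Dict[str, Dict[str, float]],
    skip_prob: float,
    insert_probs: Dict[str, float],
) -> List[Arc]:
    """Return the original arcs plus arcs allowed by the confusion model.

    confusion[p][q] is the assumed probability that recognized phone p
    should be read as phone q; skip_prob is the assumed probability that
    a recognized phone is spurious; insert_probs[q] is the assumed
    probability that phone q was missed at a node.
    """
    extended = list(arcs)
    for arc in arcs:
        # Substitution: a parallel arc carrying an alternative phone label.
        for alt, p in confusion.get(arc.phone, {}).items():
            if alt != arc.phone and p > 0.0:
                extended.append(
                    Arc(arc.start, arc.end, alt, arc.logp + math.log(p))
                )
        # Deletion: a parallel epsilon arc that skips the recognized phone.
        if skip_prob > 0.0:
            extended.append(
                Arc(arc.start, arc.end, "", arc.logp + math.log(skip_prob))
            )
    # Insertion: a self-loop that emits a phone without consuming an arc.
    nodes = sorted({n for a in arcs for n in (a.start, a.end)})
    for node in nodes:
        for phone, p in insert_probs.items():
            if p > 0.0:
                extended.append(Arc(node, node, phone, math.log(p)))
    return extended


if __name__ == "__main__":
    # Toy two-arc lattice with made-up scores and confusion probabilities.
    lattice = [Arc(0, 1, "b", -0.1), Arc(1, 2, "a", -0.2)]
    for arc in extend_lattice(lattice, {"b": {"p": 0.2}}, 0.05, {"t": 0.01}):
        print(arc)
```

A second-layer decoder could then search the extended arc list against the expected word pronunciations, so that a mismatch costs the corresponding log confusion probability rather than ruling the word out.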
Bibliographic reference. Yılmaz, Emre / Pelemans, Joris / Van hamme, Hugo (2014): "Automatic assessment of children's reading with the FLaVoR decoding using a phone confusion model", In INTERSPEECH-2014, 969-972.