Detection of Mispronunciations and Disfluencies in Children Reading Aloud

Jorge Proença, Carla Lopes, Michael Tjalve, Andreas Stolcke, Sara Candeias, Fernando Perdigão


To automatically evaluate the performance of children reading aloud, or to follow a child’s reading in reading tutor applications, different types of reading disfluencies and mispronunciations must be accounted for. In this work, we aim to detect most of these disfluencies in sentence and pseudoword reading. Detecting incorrectly pronounced words and quantifying the quality of word pronunciations is arguably the hardest task. We approach the challenge as a two-step process. First, a segmentation using task-specific lattices is performed, which detects repetitions and false starts and provides candidate segments for words. Then, candidates are classified as mispronounced or not, using multiple features derived from likelihood ratios based on phone decoding and forced alignment, as well as additional meta-information about the word. Several classifiers were explored (linear fit, neural networks, support vector machines) and trained after a feature selection stage to avoid overfitting. Combining features improves on using only the log-likelihood ratio of the reference word (22% versus 27% miss rate at a constant 5% false alarm rate).
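The operating point quoted above (miss rate at a constant false alarm rate) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function name `miss_rate_at_false_alarm`, the score convention (higher means more likely mispronounced), and the toy data are all assumptions made for illustration.

```python
def miss_rate_at_false_alarm(scores, labels, target_fa=0.05):
    """Return (miss_rate, false_alarm_rate, threshold).

    Assumed convention: a higher score means "more likely mispronounced".
    labels: 1 for mispronounced words, 0 for correctly pronounced words.
    A word is flagged when its score is strictly above the threshold.
    """
    # Scores of correctly pronounced words, highest first.
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one example of each class")

    # How many correct words may be flagged without exceeding target_fa.
    # The small epsilon guards against floating-point round-down.
    allowed = int(target_fa * len(negatives) + 1e-9)

    # Thresholding at the (allowed+1)-th highest negative score flags at most
    # `allowed` correct words, even when scores are tied.
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")

    false_alarms = sum(1 for s in negatives if s > threshold)
    misses = sum(1 for s in positives if s <= threshold)
    return misses / len(positives), false_alarms / len(negatives), threshold


# Toy data (hypothetical): 20 correct words and 5 mispronounced words.
toy_scores = list(range(20)) + [25, 30, 18.5, 10, 5]
toy_labels = [0] * 20 + [1] * 5
miss, fa, thr = miss_rate_at_false_alarm(toy_scores, toy_labels, target_fa=0.05)
print(f"miss rate = {miss:.2f}, false alarm rate = {fa:.2f}, threshold = {thr}")
```

Fixing the false alarm rate first and then reading off the miss rate lets two detectors be compared at the same cost to fluent readers, which is how the two miss rates in the abstract are contrasted.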


DOI: 10.21437/Interspeech.2017-1522

Cite as: Proença, J., Lopes, C., Tjalve, M., Stolcke, A., Candeias, S., Perdigão, F. (2017) Detection of Mispronunciations and Disfluencies in Children Reading Aloud. Proc. Interspeech 2017, 1437-1441, DOI: 10.21437/Interspeech.2017-1522.


@inproceedings{Proença2017,
  author={Jorge Proença and Carla Lopes and Michael Tjalve and Andreas Stolcke and Sara Candeias and Fernando Perdigão},
  title={Detection of Mispronunciations and Disfluencies in Children Reading Aloud},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1437--1441},
  doi={10.21437/Interspeech.2017-1522},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1522}
}