A Lightly Supervised Approach to Detect Stuttering in Children's Speech

Sadeen Alharbi, Madina Hasan, Anthony J H Simons, Shelagh Brumfitt, Phil Green


In speech pathology, new assistive technologies using ASR and machine learning approaches are being developed for detecting speech disorder events. Classically trained ASR models tend to remove disfluencies from spoken utterances, because they focus on producing clean, readable text output. However, diagnostic systems need to track speech disfluencies, such as stuttering events, in order to determine the severity level of stuttering. To achieve this, ASR systems must be adapted to recognise full verbatim utterances, including pseudo-words and non-meaningful part-words. This work proposes a training regime that addresses this problem and preserves a full verbatim transcript of stuttered speech. We apply a lightly supervised approach that uses task-oriented lattices to recognise the stuttered speech of children performing a standard reading task. This approach improved the WER by 27.8% relative to a baseline that uses word lattices generated from the original prompt. The improved results preserved 63% of stuttering events (including sound, word, part-word and phrase repetition, and revision). This work also proposes a separate correction layer on top of the ASR that detects prolongation events, which are poorly recognised by the ASR. This increases the percentage of preserved stuttering events to 70%.
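The two headline metrics above are standard: WER is the Levenshtein edit distance between the reference and hypothesis word sequences, normalised by the reference length, and the 27.8% figure is a relative (not absolute) reduction. A minimal sketch of both computations follows; the function names and example numbers are illustrative, not taken from the paper's code.

```python
# Hedged sketch: standard WER (word-level Levenshtein distance / reference
# length) and relative WER reduction. Not the authors' implementation.

def wer(ref_words, hyp_words):
    """Word error rate: edit distance over words, divided by reference length."""
    d = [[0] * (len(hyp_words) + 1) for _ in range(len(ref_words) + 1)]
    for i in range(len(ref_words) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp_words) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref_words) + 1):
        for j in range(1, len(hyp_words) + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution (or match)
    return d[-1][-1] / len(ref_words)

def relative_wer_reduction(baseline_wer, improved_wer):
    """Relative reduction: (baseline - improved) / baseline."""
    return (baseline_wer - improved_wer) / baseline_wer

# Example with made-up numbers: a baseline WER of 0.500 improved to 0.361
# corresponds to a 27.8% relative reduction.
print(wer("the cat sat".split(), "the cap sat".split()))  # one substitution out of 3 words
print(relative_wer_reduction(0.500, 0.361))
```

Note that a verbatim-preserving system is scored against a verbatim reference transcript (repetitions and part-words included), so a low WER here means disfluencies were kept, not cleaned away.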


DOI: 10.21437/Interspeech.2018-2155

Cite as: Alharbi, S., Hasan, M., J H Simons, A., Brumfitt, S., Green, P. (2018) A Lightly Supervised Approach to Detect Stuttering in Children's Speech. Proc. Interspeech 2018, 3433-3437, DOI: 10.21437/Interspeech.2018-2155.


@inproceedings{Alharbi2018,
  author={Sadeen Alharbi and Madina Hasan and Anthony {J H Simons} and Shelagh Brumfitt and Phil Green},
  title={A Lightly Supervised Approach to Detect Stuttering in Children's Speech},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3433--3437},
  doi={10.21437/Interspeech.2018-2155},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2155}
}