Anomaly Detection Approach for Pronunciation Verification of Disordered Speech Using Speech Attribute Features

Mostafa Shahin, Beena Ahmed, Jim X. Ji, Kirrie Ballard


The automatic assessment of speech is a powerful tool in computer-aided speech therapy for disorders such as Childhood Apraxia of Speech (CAS). However, the lack of sufficient annotated disordered speech data seriously impedes the accurate detection of pronunciation errors. To address this deficiency, we treat pronunciation verification as an anomaly detection problem. We model only the correct pronunciation of each individual phoneme with a one-class Support Vector Machine (SVM) trained on a set of speech attribute features, namely the manner and place of articulation. These features are extracted from a bank of pre-trained Deep Neural Network (DNN) speech attribute classifiers. The one-class SVM model classifies each phoneme production as normal (correct) or as an anomaly (incorrect). We evaluated the system using both native speech with artificial errors and disordered speech collected from children with apraxia of speech, and compared it with the DNN Goodness of Pronunciation (GOP) algorithm. The results show that our approach reduces the false-rejection rate by around 35% when applied to disordered speech.
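The per-phoneme one-class modeling described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn's OneClassSVM, and the feature dimension, the attribute patterns, the kernel, and the nu and gamma settings are all hypothetical choices made for illustration. Synthetic vectors stand in for the outputs of the DNN speech attribute classifiers.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)


def synthetic_attribute_vectors(center, n, spread=0.05):
    """Hypothetical stand-in for per-phoneme attribute scores.

    In the paper these features come from pre-trained DNN attribute
    classifiers; here they are random points around a fixed pattern.
    """
    noisy = center + spread * rng.standard_normal((n, center.size))
    return np.clip(noisy, 0.0, 1.0)


# Assumed attribute patterns for two phonemes (illustrative values only).
correct_patterns = {
    "s": np.array([0.8, 0.2, 0.8, 0.2, 0.8, 0.2, 0.8, 0.2]),
    "t": np.array([0.2, 0.8, 0.2, 0.8, 0.2, 0.8, 0.2, 0.8]),
}

# One model per phoneme, fitted on correct productions only.
# kernel, nu, and gamma are assumed values, not taken from the paper.
models = {
    phoneme: OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(
        synthetic_attribute_vectors(pattern, n=200)
    )
    for phoneme, pattern in correct_patterns.items()
}


def is_correct(phoneme, features):
    """Return True if the production is classified as normal (correct)."""
    label = models[phoneme].predict(features.reshape(1, -1))[0]
    return bool(label == 1)  # predict() yields +1 for inliers, -1 for outliers
```

Calling is_correct("s", features) checks a production against the model of the expected phoneme only, so no examples of mispronunciations are needed at training time. That is the property that makes the anomaly detection framing attractive when annotated disordered speech is scarce.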


DOI: 10.21437/Interspeech.2018-1319

Cite as: Shahin, M., Ahmed, B., Ji, J.X., Ballard, K. (2018) Anomaly Detection Approach for Pronunciation Verification of Disordered Speech Using Speech Attribute Features. Proc. Interspeech 2018, 1671-1675, DOI: 10.21437/Interspeech.2018-1319.


@inproceedings{Shahin2018,
  author={Mostafa Shahin and Beena Ahmed and Jim X. Ji and Kirrie Ballard},
  title={Anomaly Detection Approach for Pronunciation Verification of Disordered Speech Using Speech Attribute Features},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1671--1675},
  doi={10.21437/Interspeech.2018-1319},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1319}
}