The Use of Machine Learning and Phonetic Endophenotypes to Discover Genetic Variants Associated with Speech Sound Disorder

Jason Lilley, Erin Crowgey, H Timothy Bunnell


Thirty-four children with reported speech sound disorders (SSD) were recruited for a prior study, along with 31 of their siblings, many of whom also showed SSD. Using data-clustering techniques, we assigned each child to one or more endophenotypes defined by the number and type of speech errors made on the Goldman-Fristoe Test of Articulation, Second Edition (GFTA-2). Genetic samples from 53 of the participants underwent whole exome sequencing. Variant alleles were detected, filtered and annotated from the sequences, and the resulting data were further filtered using quality checks, annotations and phenotypes. We then used Random Forest classification to search for associations between variants and endophenotypes. In this preliminary report, we highlight one promising association with a common variant of COMT, which encodes a dopamine-metabolizing enzyme in the brain.
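The variant-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses a simplified forest of one-split trees (decision stumps) in place of a full Random Forest, the genotype matrix is synthetic, and every name in it (`stump_forest_importance`, `CAUSAL`, the 0/1/2 genotype coding, the 20-variant panel) is a hypothetical choice made for illustration. Only the sample size of 53 is taken from the abstract.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, rows, features):
    """Best single split among `features`: (feature, threshold, impurity decrease)."""
    parent = gini([y[i] for i in rows])
    best = (None, None, 0.0)
    for f in features:
        for t in (0, 1):  # assumed genotype coding 0/1/2; split is "value <= t"
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - child
            if gain > best[2]:
                best = (f, t, gain)
    return best


def stump_forest_importance(X, y, n_trees=200, seed=0):
    """Normalized impurity-decrease importance from a forest of stumps.

    Each stump sees a bootstrap sample of the rows and a random subset of
    about sqrt(m) variants, as in Random Forest, but has depth 1 only.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    k = max(1, int(m ** 0.5))
    importance = [0.0] * m
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        features = rng.sample(range(m), k)           # random variant subset
        f, _, gain = best_stump(X, y, rows, features)
        if f is not None:
            importance[f] += gain
    total = sum(importance)
    return [v / total for v in importance] if total else importance


# Synthetic toy data (NOT the study's data): 53 children, 20 variants.
# Variant index 3 is made to drive endophenotype membership, with 10% label noise.
rng = random.Random(42)
N_CHILDREN, N_VARIANTS, CAUSAL = 53, 20, 3
X = [[rng.choice((0, 1, 2)) for _ in range(N_VARIANTS)] for _ in range(N_CHILDREN)]
y = []
for row in X:
    label = 1 if row[CAUSAL] >= 1 else 0
    if rng.random() < 0.1:
        label = 1 - label
    y.append(label)

importance = stump_forest_importance(X, y)
ranked = sorted(range(N_VARIANTS), key=lambda j: importance[j], reverse=True)
print("Top-ranked variant indices:", ranked[:3])
```

In this toy setting the planted variant should rise to the top of the ranking. With real exome data there are far more variants than participants, so a ranking of this kind is better read as a list of candidates for follow-up than as evidence of association.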


DOI: 10.21437/Interspeech.2018-2398

Cite as: Lilley, J., Crowgey, E., Bunnell, H.T. (2018) The Use of Machine Learning and Phonetic Endophenotypes to Discover Genetic Variants Associated with Speech Sound Disorder. Proc. Interspeech 2018, 461-465, DOI: 10.21437/Interspeech.2018-2398.


@inproceedings{Lilley2018,
  author={Jason Lilley and Erin Crowgey and H Timothy Bunnell},
  title={The Use of Machine Learning and Phonetic Endophenotypes to Discover Genetic Variants Associated with Speech Sound Disorder},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={461--465},
  doi={10.21437/Interspeech.2018-2398},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2398}
}