Facial (e.g., lip and jaw) movements can provide important information for the assessment, diagnosis, and treatment of motor speech disorders. However, due to the high cost of the instrumentation used to record speech movements, such information is typically limited to research studies. With the recent development of depth sensors and efficient facial-tracking algorithms, clinical applications of this technology may become possible. Although lip-tracking methods have been validated in the past, jaw tracking remains a challenge. In this study, we assessed the accuracy of tracking jaw movements with a video-based system composed of a face tracker and a depth sensor developed specifically for short-range applications (Intel RealSense SR300). The assessment was performed on healthy subjects during speech and non-speech tasks. Preliminary results showed that jaw movements can be tracked with reasonable accuracy (RMSE ≈ 2 mm), with better performance for slow movements. Further tests are needed to improve the performance of these systems and to develop accurate methodologies that can reveal subtle changes in jaw movements for the assessment and treatment of motor speech disorders.
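As a minimal sketch of the accuracy metric reported above (the paper does not publish its code, so the function name, array shapes, and toy data are assumptions), a per-frame RMSE between a video-tracked jaw trajectory and a reference trajectory could be computed as follows:

```python
import numpy as np

def rmse(tracked, reference):
    """Root-mean-square error between two 3D trajectories (hypothetical helper).

    tracked, reference: arrays of shape (n_frames, 3) holding, e.g., the
    video-based and reference jaw positions in millimetres.
    """
    tracked = np.asarray(tracked, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Per-frame Euclidean distance between the two trajectories
    errors = np.linalg.norm(tracked - reference, axis=1)
    # Root of the mean squared per-frame error
    return float(np.sqrt(np.mean(errors ** 2)))

# Toy example: a constant 2 mm offset along one axis yields an RMSE of 2 mm
reference = np.zeros((100, 3))
tracked = reference + np.array([2.0, 0.0, 0.0])
print(rmse(tracked, reference))  # → 2.0
```

A lower RMSE indicates closer agreement with the reference measurement; on real data the reference trajectory would come from a gold-standard motion-capture system rather than synthetic offsets.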
Cite as: Bandini, A., Namasivayam, A., Yunusova, Y. (2017) Video-Based Tracking of Jaw Movements During Speech: Preliminary Results and Future Directions. Proc. Interspeech 2017, 689-693, doi: 10.21437/Interspeech.2017-1371
@inproceedings{bandini17_interspeech,
  author={Andrea Bandini and Aravind Namasivayam and Yana Yunusova},
  title={{Video-Based Tracking of Jaw Movements During Speech: Preliminary Results and Future Directions}},
  year={2017},
  booktitle={Proc. Interspeech 2017},
  pages={689--693},
  doi={10.21437/Interspeech.2017-1371}
}