Automated Classification of Vowel-Gesture Parameters Using External Broadband Excitation

Balamurali B T, Jer-Ming Chen


External broadband signal excitation applied at the mouth of a speaker or singer has previously been used successfully to estimate the acoustic resonances of the vocal tract during speaking and singing. In this study, we used a modified, low-cost, lightweight, pocket-sized and simplified version of this measurement technique, with reduced sampling time and improved low-frequency detection, so that such vocal tract measurements may be easily deployed ‘in the field’ and facilitate a more ‘ecological/natural’ tracking of phonatory gestures. This system was investigated with six volunteer speakers phonating 17 English vowels, and the relative impedance spectrum γ (‘gamma’) was measured. Although the γ(f) signal measured here for each phonatory gesture is somewhat noisier than that obtained with the original technique, it is still believed to carry some important cues associated with the vocal tract configurations that produce these vowels. Features were identified in both the amplitude and the phase of γ(f), and three ensemble classifiers (random forest, gradient boosting and AdaBoost) were trained on them. The predictions of these classifiers were combined using soft voting to predict a class label (front-central-back; open-close). This yielded an accuracy exceeding 80% in classifying the six nominal regions of the vowel plane.
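
The soft-voting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six label names (taken here as the cross product of front/central/back and open/close), the function `soft_vote`, the equal weights, and the probability values are all assumptions made for illustration. Soft voting averages the class probabilities reported by each classifier and picks the class with the highest mean.

```python
from itertools import product

# Assumed label set: cross product of the two axes named in the abstract.
LABELS = [
    f"{place}-{height}"
    for place, height in product(("front", "central", "back"), ("open", "close"))
]


def soft_vote(prob_rows, weights=None):
    """Combine per-classifier class probabilities by (weighted) averaging.

    prob_rows: one probability vector per classifier, ordered like LABELS.
    Returns the winning label and the averaged probability vector.
    """
    if not prob_rows:
        raise ValueError("need at least one classifier output")
    n = len(LABELS)
    if any(len(row) != n for row in prob_rows):
        raise ValueError("each probability vector must have one entry per label")
    if weights is None:
        weights = [1.0] * len(prob_rows)  # equal weighting is an assumption
    total = sum(weights)
    avg = [
        sum(w * row[i] for w, row in zip(weights, prob_rows)) / total
        for i in range(n)
    ]
    best = max(range(n), key=avg.__getitem__)
    return LABELS[best], avg


# Made-up probabilities for a single vowel token (illustrative only).
rf_probs = [0.90, 0.02, 0.02, 0.02, 0.02, 0.02]   # hypothetical random forest output
gb_probs = [0.40, 0.05, 0.45, 0.04, 0.03, 0.03]   # hypothetical gradient boosting output
ada_probs = [0.40, 0.04, 0.45, 0.05, 0.03, 0.03]  # hypothetical AdaBoost output

label, avg = soft_vote([rf_probs, gb_probs, ada_probs])
print(label, [round(p, 3) for p in avg])
```

In this made-up case two of the three classifiers narrowly prefer `central-open`, but the averaged probabilities favour `front-open` because one classifier is far more confident. This is the way in which soft voting differs from a simple majority vote over predicted labels.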


DOI: 10.21437/Interspeech.2018-1756

Cite as: B T, B., Chen, J. (2018) Automated Classification of Vowel-Gesture Parameters Using External Broadband Excitation. Proc. Interspeech 2018, 2315-2318, DOI: 10.21437/Interspeech.2018-1756.


@inproceedings{BT2018,
  author={Balamurali {B T} and Jer-Ming Chen},
  title={Automated Classification of Vowel-Gesture Parameters Using External Broadband Excitation},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2315--2318},
  doi={10.21437/Interspeech.2018-1756},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1756}
}