ISCA Archive Interspeech 2021

Automatic Extraction of Speech Rhythm Descriptors for Speech Intelligibility Assessment in the Context of Head and Neck Cancers

Robin Vaysse, Jérôme Farinas, Corine Astésano, Régine André-Obrecht

The temporal dimension of speech acoustics is rarely taken into account in automatic models for speech intelligibility evaluation, although the rhythmic recurrence of phonemes, syllables and prosodic groups is reportedly a good predictor of speech intelligibility. The present study aims to identify the automatically extracted parameters that best account for the different levels of the speech signal’s rhythmic structure, and to evaluate their correlation with a perceptual intelligibility measure. The parameters are extracted from the Fourier transform of the amplitude modulation of the signal (Envelope Modulation Spectrum, EMS) [1, 2]. A Lasso linear model is first used to select the most relevant parameters, and a support vector regression (SVR) analysis is then run to find the best combination of parameters. Our analyses of EMS, using data from the French cancer speech corpus C2SI [3], show strong performance of the automatic prediction, with a correlation of 0.70 between our model and an intelligibility score assigned by speech pathologists. In particular, the highest correlation with speech intelligibility is found for the ratio between the energy in the low frequency band (0.5–4 Hz, which represents slow rhythmic modulations indicative of prosodic groups) and the energy in the higher band (4–10 Hz, which represents fast rhythmic modulations such as phonemes).
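
As an illustration of the low-to-high band energy ratio mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a crude envelope (full-wave rectification followed by block averaging down to 100 Hz), a direct DFT of the mean-removed envelope, and half-open band edges [0.5, 4) Hz and [4, 10) Hz. The function names, the envelope method, the 100 Hz envelope rate and the edge convention are all assumptions made for the example; only the two frequency bands come from the abstract.

```python
import math


def envelope(signal, sample_rate, env_rate=100):
    """Crude amplitude envelope: rectify, then average over blocks so that
    the result is sampled at env_rate Hz (an assumed value)."""
    block = sample_rate // env_rate
    n_blocks = len(signal) // block
    return [
        sum(abs(x) for x in signal[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


def modulation_spectrum(env, env_rate, max_hz=10.0):
    """Power of the mean-removed envelope at each DFT bin from the first
    non-zero bin up to max_hz, computed by direct summation."""
    n = len(env)
    mean = sum(env) / n
    centered = [e - mean for e in env]
    freqs, power = [], []
    k = 1
    while k * env_rate / n <= max_hz:
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        freqs.append(k * env_rate / n)
        power.append(re * re + im * im)
        k += 1
    return freqs, power


def band_energy(freqs, power, lo, hi):
    """Sum of spectral power over the half-open band [lo, hi) in Hz."""
    return sum(p for f, p in zip(freqs, power) if lo <= f < hi)


def low_high_ratio(signal, sample_rate, env_rate=100):
    """Energy in 0.5-4 Hz divided by energy in 4-10 Hz of the envelope
    modulation spectrum (band edges taken from the abstract)."""
    env = envelope(signal, sample_rate, env_rate)
    freqs, power = modulation_spectrum(env, env_rate)
    low = band_energy(freqs, power, 0.5, 4.0)
    high = band_energy(freqs, power, 4.0, 10.0)
    return low / high if high > 0 else float("inf")
```

Under these assumptions, a carrier whose amplitude is modulated slowly (for example at 2 Hz) should give a ratio above 1, while one modulated quickly (for example at 6 Hz) should give a ratio below 1. On real speech, a smoother envelope estimate and a windowed FFT would likely be preferable to this direct summation.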


doi: 10.21437/Interspeech.2021-1736

Cite as: Vaysse, R., Farinas, J., Astésano, C., André-Obrecht, R. (2021) Automatic Extraction of Speech Rhythm Descriptors for Speech Intelligibility Assessment in the Context of Head and Neck Cancers. Proc. Interspeech 2021, 1912-1916, doi: 10.21437/Interspeech.2021-1736

@inproceedings{vaysse21_interspeech,
  author={Robin Vaysse and Jérôme Farinas and Corine Astésano and Régine André-Obrecht},
  title={{Automatic Extraction of Speech Rhythm Descriptors for Speech Intelligibility Assessment in the Context of Head and Neck Cancers}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1912--1916},
  doi={10.21437/Interspeech.2021-1736}
}