ISCA Archive Interspeech 2017

A Comparison of Sentence-Level Speech Intelligibility Metrics

Alexander Kain, Max Del Giudice, Kris Tjaden

We examine existing and novel automatically-derived acoustic metrics that are predictive of speech intelligibility. We hypothesize that the degree of variability in feature space is correlated with the extent of a speaker’s phonemic inventory, their degree of articulatory displacements, and thus with their degree of perceived speech intelligibility. We begin by using fully-automatic F1/F2 formant frequency trajectories for both vowel space area calculation and as input to a proposed class-separability metric. We then switch to representing vowels by means of short-term spectral features, and measure vowel separability in that space. Finally, we consider the case where phonetic labeling is unavailable; here we calculate short-term spectral features for the entire speech utterance and then estimate their entropy based on the length of a minimum spanning tree. In an alternative approach, we propose to first segment the speech signal using a hidden Markov model, and then calculate spectral feature separability based on the automatically-derived classes. We apply all approaches to a database with healthy controls as well as speakers with mild dysarthria, and report the resulting coefficients of determination.
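
Two of the quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on stated assumptions: vowel space area is taken as the area of the polygon whose vertices are per-vowel mean (F1, F2) points, and the spanning-tree quantity is the total Euclidean edge length of a minimum spanning tree over feature vectors. The function names `polygon_area` and `mst_length` are hypothetical, and the mapping from tree length to an entropy estimate is not given in the abstract, so it is left out.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    Assumption: `points` are (F1, F2) vowel means listed in order
    around the polygon (e.g. corner vowels), not in arbitrary order.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def mst_length(points):
    """Total Euclidean edge length of a minimum spanning tree.

    Uses Prim's algorithm on the complete graph, O(n^2) time.
    Assumption: each point is one short-term feature vector.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each point
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        # Update attachment costs for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total
```

Under these assumptions, more widely spread points yield both a larger polygon area and a longer spanning tree, which matches the abstract's hypothesis that greater variability in feature space accompanies higher intelligibility.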

doi: 10.21437/Interspeech.2017-567

Cite as: Kain, A., Del Giudice, M., Tjaden, K. (2017) A Comparison of Sentence-Level Speech Intelligibility Metrics. Proc. Interspeech 2017, 1148-1152, doi: 10.21437/Interspeech.2017-567

  author={Alexander Kain and Max Del Giudice and Kris Tjaden},
  title={{A Comparison of Sentence-Level Speech Intelligibility Metrics}},
  booktitle={Proc. Interspeech 2017},