Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank

Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani


In this study, we develop a new method for predicting the speech intelligibility of synthetic sounds processed by nonlinear speech enhancement algorithms. A speech envelope power spectrum model (sEPSM) was previously proposed to account for subjective results on spectral subtraction, but it has not been tested on recent state-of-the-art speech enhancement algorithms. We introduce a dynamic compressive gammachirp auditory filterbank as the front-end of the sEPSM (dcGC-sEPSM) to improve its predictions. We perform subjective experiments on the speech intelligibility (SI) of noise-reduced sounds processed by spectral subtraction and by a recently developed Wiener filter algorithm. We compare the subjective SI scores with the objective SI scores predicted by the proposed dcGC-sEPSM, the original gammatone-based sEPSM (GT-sEPSM), the three-level coherence SII (CSII), and the short-time objective intelligibility (STOI) measure. The results show that the proposed dcGC-sEPSM predicts the subjective scores better than the conventional models.


DOI: 10.21437/Interspeech.2016-652

Cite as

Yamamoto, K., Irino, T., Matsui, T., Araki, S., Kinoshita, K., Nakatani, T. (2016) Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank. Proc. Interspeech 2016, 2885-2889.

BibTeX
@inproceedings{Yamamoto+2016,
author={Katsuhiko Yamamoto and Toshio Irino and Toshie Matsui and Shoko Araki and Keisuke Kinoshita and Tomohiro Nakatani},
title={Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-652},
url={http://dx.doi.org/10.21437/Interspeech.2016-652},
pages={2885--2889}
}