While temporal envelope and fine-structure cues are known to be good predictors of speech intelligibility, it is not clear how well they correlate with subjective quality ratings, particularly for noise-suppressed speech. The present work evaluated two objective measures (NCM and TFSS), originally developed as speech intelligibility indices based primarily on either envelope or fine-structure cues, in predicting the subjective quality ratings of noise-suppressed speech along three dimensions: signal distortion, noise distortion, and overall quality. We considered a wide range of distortions introduced by four types of real-world noise at two signal-to-noise ratio levels and by four classes of noise-suppression algorithms. The results show that the present envelope- and fine-structure-based measures poorly predict the subjective quality ratings of noise-suppressed speech. The PESQ measure is so far the best choice for objectively predicting both the subjective quality ratings and the intelligibility scores of noise-suppressed speech.
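As a rough illustration of how an objective measure can be compared against subjective ratings, the sketch below computes Pearson's correlation coefficient between two lists of per-condition scores. The abstract does not state which figure of merit was used, so the choice of Pearson's r is an assumption, and the function name `pearson_r` and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(objective_scores, subjective_ratings):
    """Pearson correlation between two equal-length sequences of scores.

    Assumption: one objective score and one mean subjective rating per
    test condition (e.g. noise type x SNR level x algorithm).
    """
    if len(objective_scores) != len(subjective_ratings):
        raise ValueError("both sequences must have the same length")
    n = len(objective_scores)
    if n < 2:
        raise ValueError("at least two conditions are required")

    mean_x = sum(objective_scores) / n
    mean_y = sum(subjective_ratings) / n

    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(objective_scores, subjective_ratings))
    var_x = sum((x - mean_x) ** 2 for x in objective_scores)
    var_y = sum((y - mean_y) ** 2 for y in subjective_ratings)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up numbers for illustration only; NOT results from the paper.
    hypothetical_measure = [0.42, 0.55, 0.61, 0.70, 0.78]
    hypothetical_ratings = [2.1, 2.4, 3.0, 2.8, 3.6]
    print(f"r = {pearson_r(hypothetical_measure, hypothetical_ratings):.3f}")
```

A value of r near 1 would indicate that the measure tracks the ratings closely, while a value near 0 would correspond to the poor prediction reported above for the envelope- and fine-structure-based measures.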
Bibliographic reference. Chen, Fei / Hu, Yi (2014): "Objective quality evaluation of noise-suppressed speech: effects of temporal envelope and fine-structure cues", In INTERSPEECH-2014, 2055-2058.