15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Objective Quality Evaluation of Noise-Suppressed Speech: Effects of Temporal Envelope and Fine-Structure Cues

Fei Chen (1), Yi Hu (2)

(1) University of Hong Kong, China
(2) University of Wisconsin-Milwaukee, USA

While temporal envelope and fine-structure cues are known to be good predictors of speech intelligibility, it is not clear how well they correlate with subjective quality ratings, particularly for noise-suppressed speech. The present work evaluated two objective measures (NCM and TFSS), originally developed as speech intelligibility indices based primarily on envelope or fine-structure cues, when applied to predicting the subjective quality ratings of noise-suppressed speech along three dimensions: signal distortion, noise distortion, and overall quality. We considered a wide range of distortions introduced by four types of real-world noise at two signal-to-noise ratio levels and by four classes of noise-suppression algorithms. The results show that the present envelope- and fine-structure-based measures poorly predict the subjective quality ratings of noise-suppressed speech. The PESQ measure is so far the best choice for objectively predicting both the subjective quality ratings and the intelligibility scores of noise-suppressed speech.
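As an illustration of what it means for an objective measure to "predict" subjective ratings, the sketch below computes a Pearson correlation coefficient between per-condition objective scores and mean subjective ratings. This is only a minimal sketch under stated assumptions: the abstract does not specify the figure of merit used in the paper, the function name `pearson_r` is hypothetical, and the numbers are placeholders invented for the example, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, invented for illustration only (NOT results from the paper).
# One entry per test condition (noise type x SNR level x algorithm).
objective_scores = [0.2, 0.4, 0.6, 0.8]    # e.g. output of some objective measure
subjective_ratings = [1.0, 2.0, 3.0, 4.0]  # e.g. mean listener rating per condition

r = pearson_r(objective_scores, subjective_ratings)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 in magnitude would indicate that the objective measure tracks the ratings closely, while a value near 0 would correspond to the poor prediction reported above for the envelope- and fine-structure-based measures.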

