Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

An Objective Measure for Qualitatively Assessing Low-Bit-Rate Coded Speech

Toshiro Watanabe, Shinji Hayashi

NTT Human Interface Laboratories, Tokyo, Japan

This paper reports an objective measure for assessing the quality of low-bit-rate coded speech. The underlying model emulates several known features of the perceptual processing of speech sounds by the human ear: the Hertz-to-Bark transformation, critical-band filtering with preemphasis to boost higher frequencies, nonlinear conversion to subjective loudness, and a temporal (forward) masking process. The Bark spectral distortion rating (BSDR) is computed for each 10-20 ms segment of the original and coded speech. The measure was validated by regression analysis between the computed BSDR values and subjective MOS ratings obtained for a large number of utterances coded by several versions of a CELP coder and a VSELP coder under three degradation conditions: varying input speech levels, transmission error rates, and background noise levels. The BSDR values correspond better to MOS ratings than several commonly used measures do, which indicates that BSDR can be used to accurately predict subjective scores.
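
The perceptual front end described in the abstract can be illustrated with a minimal sketch. This is not the authors' BSDR implementation: the Hertz-to-Bark formula (Traunmüller's approximation), the rectangular 1-Bark bands, the loudness exponent of 0.3, and the function names `hz_to_bark`, `bark_band_energies`, and `frame_distortion` are all assumptions chosen for illustration. Preemphasis and forward masking are omitted for brevity.

```python
import math

def hz_to_bark(f_hz):
    """Map frequency in Hz to the Bark scale.

    Uses Traunmüller's approximation (an assumption; the abstract
    does not state which Hertz-to-Bark formula the paper uses).
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_band_energies(power_spectrum, sample_rate, n_bands=18):
    """Sum power-spectrum bins into 1-Bark-wide bands.

    Rectangular bands are a simplification of critical-band filtering.
    power_spectrum covers 0 .. sample_rate/2 with evenly spaced bins.
    """
    n_bins = len(power_spectrum)
    bands = [0.0] * n_bands
    for k, power in enumerate(power_spectrum):
        f_hz = k * (sample_rate / 2.0) / (n_bins - 1)
        # Clamp so that the lowest and highest bins stay inside the band range.
        idx = min(max(int(math.floor(hz_to_bark(f_hz))), 0), n_bands - 1)
        bands[idx] += power
    return bands

def frame_distortion(ref_spectrum, coded_spectrum, sample_rate, exponent=0.3):
    """Normalised squared distance between loudness-like Bark spectra.

    The power-law exponent is a hypothetical stand-in for the paper's
    nonlinear loudness conversion.
    """
    ref = [e ** exponent for e in bark_band_energies(ref_spectrum, sample_rate)]
    cod = [e ** exponent for e in bark_band_energies(coded_spectrum, sample_rate)]
    num = sum((r - c) ** 2 for r, c in zip(ref, cod))
    den = sum(r * r for r in ref)
    return num / den if den > 0 else 0.0

# Usage: one frame of a flat spectrum, and a copy attenuated by half in power.
reference = [1.0] * 65
coded = [0.5] * 65
print(frame_distortion(reference, reference, 8000))  # identical frames
print(frame_distortion(reference, coded, 8000))      # attenuated frame
```

In a complete system, a per-frame value of this kind would be computed for every 10-20 ms segment, averaged over the utterance, and then related to MOS ratings by regression.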

