8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Speech Quality Estimation Using Gaussian Mixture Models

Tiago Falk (1), Wai-Yip Chan (1), Peter Kabal (2)

(1) Queen's University, Canada
(2) McGill University, Canada

We propose a novel method to estimate the quality of coded speech signals. The joint probability distribution of the subjective mean opinion score (MOS) and perceptual distortion feature variables is modelled using a Gaussian mixture density. The feature variables are sifted from a large pool of candidate features using statistical data mining techniques. We study which combinations of features and mixture model configurations are most effective. For our speech database, a five-feature, three-component GMM furnishes approximately 18% lower root-mean-squared MOS estimation error than ITU-T P.862 PESQ, the current best standard algorithm.
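
As an illustration of how a joint Gaussian mixture over (feature, MOS) can yield a MOS estimate, the following minimal sketch computes the conditional mean E[MOS | feature]. The abstract does not state which estimator the authors use; the conditional mean is assumed here because it is the usual minimum mean-squared-error choice for a joint density model. For brevity the sketch uses a single scalar feature rather than five, and the function names and dictionary keys are hypothetical, not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Estimate MOS as E[MOS | feature = x] under a joint 2-D GMM.

    Each component is a dict with hypothetical keys:
      weight  - mixture weight
      mean_x  - feature mean
      mean_y  - MOS mean
      var_x   - feature variance
      cov_xy  - feature/MOS covariance
    """
    # Unnormalised posterior weight of each component given the feature value.
    resp = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components]
    total = sum(resp)

    estimate = 0.0
    for r, c in zip(resp, components):
        # Per-component linear regression of MOS on the feature.
        local = c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"])
        estimate += (r / total) * local
    return estimate
```

In practice the component parameters would be fitted to a database of subjectively scored speech, for example with the EM algorithm; none are supplied here because the abstract reports no model parameters. With several features, the scalar division by `var_x` becomes a multiplication by the inverse of the feature covariance matrix.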

Bibliographic reference. Falk, Tiago / Chan, Wai-Yip / Kabal, Peter (2004): "Speech quality estimation using Gaussian mixture models", in INTERSPEECH-2004, pp. 2013-2016.