Second ISCA/DEGA Tutorial and Research Workshop on Perceptual Quality of Systems
A non-intrusive method for speech quality assessment in telephony applications is proposed and its performance is evaluated. The method measures perception-based objective auditory distances between the voiced parts of the processed (degraded) speech signal and appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records, using an efficient data-mining tool known as the Self-Organizing Map (SOM). The auditory distances are then mapped into objective listening-quality Mean Opinion Scores (MOS_LQO). Three domain transformation techniques are used to provide a perception-based, speaker-independent parametric representation of the speech: a Perceptual Linear Prediction (PLP) model, Bark Spectrum (BS) analysis, and Mel-Frequency Cepstrum Coefficients (MFCC). The reported evaluation results show that the proposed method correlates highly with subjective listening quality scores (MOS_LQS), yielding accuracy similar to that of ITU-T P.563 while maintaining relatively low computational complexity. The results also demonstrate that the method outperforms PESQ in assessing the quality of speech degraded by channel impairments.
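The codebook-matching step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the Euclidean distance, the linear distance-to-score mapping with its `slope` and `intercept` values, the toy two-dimensional vectors, and all function names are assumptions made for illustration only. In the method itself, the vectors would be PLP, BS or MFCC features of voiced frames and the codebook would come from a trained SOM.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_reference_distance(frame, codebook):
    """Distance from one degraded-speech frame to its best-matching codebook entry."""
    return min(euclidean(frame, ref) for ref in codebook)


def mean_auditory_distance(frames, codebook):
    """Average best-match distance over all (voiced) frames of an utterance."""
    if not frames:
        raise ValueError("at least one frame is required")
    total = sum(nearest_reference_distance(f, codebook) for f in frames)
    return total / len(frames)


def distance_to_mos(distance, slope=-1.0, intercept=4.5):
    """Hypothetical linear mapping from distance to a score clipped to the 1..5 MOS range."""
    score = intercept + slope * distance
    return max(1.0, min(5.0, score))


# Toy codebook standing in for SOM-derived clean-speech reference vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

# Frames close to the codebook entries versus frames far from them.
clean_like = [[0.1, 0.0], [1.0, 0.9]]
degraded = [[0.5, 2.5], [3.5, 1.0]]

clean_score = distance_to_mos(mean_auditory_distance(clean_like, codebook))
degraded_score = distance_to_mos(mean_auditory_distance(degraded, codebook))
print(clean_score > degraded_score)
```

Frames that lie close to a clean-speech reference yield a small average distance and therefore a higher predicted score. In practice the mapping would be fitted against subjective scores rather than fixed by hand as it is here.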
Bibliographic reference. Mahdi, Abdulhussain E. (2006): "Non-intrusive SOM-based speech quality assessment for telephony applications", In PQS-2006, 123-130.