8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Measuring the Perceived Importance of Time- and Frequency-divided Speech Blocks for Transmitting over Packet Networks

Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo

NTT Corporation, Japan

This paper presents a method for calculating the perceived importance of speech segments as a single-value criterion. Unlike commonly used voice activity detection (VAD) algorithms, the method yields a finer granularity of priority among speech segments. It can be used in conjunction with frequency-scalable speech coding and IP QoS techniques to achieve efficient, quality-controlled voice transmission. The importance value is derived from a simple linear regression model that estimates the mean opinion score (MOS) for the various cases of missing speech segments.
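As a rough illustration of the idea, the following minimal sketch fits a linear model that maps loss indicators for speech blocks to an estimated MOS, then reads each block's importance off as the MOS drop its loss causes. It is not the authors' implementation: the two-block split (low band, high band), the function names `fit_linear`, `estimate_mos`, and `block_importance`, and all MOS values are hypothetical and were invented solely for this example.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept term
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting (assumes A is nonsingular).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def estimate_mos(coefs, lost):
    """Estimated MOS for a loss pattern (1 = block lost, 0 = block received)."""
    return coefs[0] + sum(w * x for w, x in zip(coefs[1:], lost))


def block_importance(coefs):
    """Importance of each block, taken here as the MOS drop when it is lost."""
    return [-w for w in coefs[1:]]


# Hypothetical training data, NOT taken from the paper:
# each row is (low band lost, high band lost), each target a made-up MOS.
loss_patterns = [(0, 0), (1, 0), (0, 1), (1, 1)]
mos_scores = [4.0, 2.5, 3.6, 2.1]

coefs = fit_linear(loss_patterns, mos_scores)
importance = block_importance(coefs)
```

With importance values of this kind, a sender could mark the blocks whose loss costs the most estimated MOS as high priority and let the network drop the others first under congestion.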

