5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Effects of Phonetic Quality and Duration on Perceptual Acceptability of Temporal Changes in Speech

Hiroaki Kato (1), Minoru Tsuzaki (1), Yoshinori Sagisaka (2)

(1) ATR HIP, Japan
(2) ATR ITL, Japan

To establish a perceptually valid rule for the durational control of synthetic speech, it is necessary to know the degree to which a given temporal error or distortion is acceptable to human listeners. Two perceptual experiments were conducted to estimate the acceptability of modifications in either vocalic or consonantal durations as a function of two attributes of the modified portions: the phonetic quality and the original (unmodified) duration, referred to here as the base duration. The results showed that the listeners' acceptable modification ranges were narrowest for vowels and widest for voiceless fricatives and silent closures, with nasals in between. The ranges were also narrower for portions with shorter base durations. The effect of the base duration was larger for the vowel stimuli than for the voiceless fricative stimuli. The perceptual mechanism mediating these results is discussed with regard to the dependency of the listeners' temporal sensitivity on the stimulus loudness and base duration. [Re: http://www.hip.atr.co.jp/~kato/single_duration/]

