Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

An Analysis of Voice Quality Using Sinusoidal Model

Naotoshi Osaka

Information Science Research Laboratory, NTT Basic Research Laboratories, Kanagawa, Japan

Voice quality control technology is useful for flexible speech synthesis systems, such as expressive speech synthesis and voice individuality control. So far, most voice quality analysis has focused on the spectral envelope. However, voiced sounds consist of several harmonics with time-varying magnitudes and frequencies. This paper analyzes voice quality by focusing on the harmonics of actual voiced speech. Five Japanese vowels uttered by two female speakers were examined. Harmonics were extracted with a sinusoidal model in which the M & Q algorithm was used to estimate the frequency-domain trajectory of each harmonic. Two results related to voice quality were obtained. First, the rate of increase in the standard deviation of the instantaneous frequencies depends on the speaker. Second, the mean magnitudes of the harmonics at lower frequencies carry more voice quality information than LPC spectral envelopes do. These results suggest avenues for improving voice quality control in speech synthesis.
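
The first result can be illustrated with a minimal Python sketch. It does not implement the M & Q algorithm or reproduce the paper's procedure. It assumes the harmonic frequency trajectories have already been extracted, and it reads "rate of increase" as the slope of the standard deviation against harmonic number, which is an interpretation the abstract does not confirm. The helper names (`harmonic_std`, `std_growth_rate`, `synth_tracks`) and the synthetic jitter values are hypothetical and stand in for real speaker data.

```python
import random
import statistics


def harmonic_std(tracks):
    """Standard deviation of instantaneous frequency (Hz) per harmonic.

    tracks[k] holds the frame-by-frame frequencies of harmonic k + 1.
    """
    return [statistics.pstdev(track) for track in tracks]


def std_growth_rate(stds):
    """Least-squares slope of the standard deviation versus harmonic number.

    Assumption: this slope is used here as a stand-in for the abstract's
    "rate of increase"; the paper may define it differently.
    """
    n = len(stds)
    ks = list(range(1, n + 1))
    mean_k = sum(ks) / n
    mean_s = sum(stds) / n
    num = sum((k - mean_k) * (s - mean_s) for k, s in zip(ks, stds))
    den = sum((k - mean_k) ** 2 for k in ks)
    return num / den


def synth_tracks(f0, jitter_hz, n_harmonics=5, n_frames=200, seed=0):
    """Synthetic, perfectly harmonic tracks (hypothetical test data).

    The fundamental wobbles around f0 with Gaussian jitter, and harmonic k
    is exactly k times the fundamental in every frame.
    """
    rng = random.Random(seed)
    f0_track = [f0 + rng.gauss(0.0, jitter_hz) for _ in range(n_frames)]
    return [[k * f for f in f0_track] for k in range(1, n_harmonics + 1)]


if __name__ == "__main__":
    # Two made-up "speakers" that differ only in fundamental-frequency jitter.
    for label, jitter in (("speaker A", 1.0), ("speaker B", 3.0)):
        stds = harmonic_std(synth_tracks(200.0, jitter))
        print(label, "slope (Hz per harmonic):", std_growth_rate(stds))
```

In this idealized setting the standard deviation of harmonic k is k times that of the fundamental, so the slope simply equals the fundamental's standard deviation. Measured trajectories would deviate from this, and such deviations are the kind of speaker-dependent behavior the abstract points to.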

