Speech Prosody 2004

Nara, Japan
March 23-26, 2004

Development of the F0 Control Model for Singing-Voices Synthesis

Takeshi Saitou, Masashi Unoki, Masato Akagi

School of Information Science, Japan Advanced Institute of Science and Technology, Japan

Singing-voice synthesis systems that generate natural singing voices require a fundamental frequency (F0) control model for singing voices. This paper describes the development of such a model. Psychoacoustical experiments investigating how strongly F0 fluctuations influence singing-voice perception identify them as characteristics that must be controlled in the F0 contour of singing voices. These fluctuations have a wider dynamic range and more complicated changes than those in speaking voices. The F0 control model is developed so that it can control the F0 fluctuations that are important for singing-voice perception. A singing-voice synthesis method using the F0 control model is also proposed to synthesize natural singing voices. The experimental results show that F0 fluctuations are significant factors in singing-voice perception, and that the F0 control model can generate F0 contours of singing voices and can be applied to synthesize natural singing voices.

Bibliographic reference.  Saitou, Takeshi / Unoki, Masashi / Akagi, Masato (2004): "Development of the F0 control model for singing-voices synthesis", In SP-2004, 491-494.