INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Vocal Conversion from Speaking Voice to Singing Voice Using STRAIGHT

Takeshi Saitou (1), Masataka Goto (1), Masashi Unoki (2), Masato Akagi (2)

(1) National Institute of Advanced Industrial Science and Technology (AIST), Japan
(2) School of Information Science, Japan Advanced Institute of Science and Technology, Japan

A vocal conversion system that can synthesize a singing voice given a speaking voice and a musical score is proposed. It is based on the speech manipulation system STRAIGHT [1] and comprises three models, each controlling one acoustic feature characteristic of singing voices: the fundamental frequency (F0), phoneme duration, and spectral envelope. Given the musical score and its tempo, the F0 control model generates the F0 contour of the singing voice by controlling four F0 fluctuations: overshoot, vibrato, preparation, and fine fluctuation. The duration control model lengthens each phoneme in the speaking voice according to the duration of its musical note. The spectral control model converts the spectral envelope of the speaking voice into that of the singing voice by controlling both the singing formant and the amplitude modulation of formants in synchronization with vibrato. Experimental results showed that the proposed system could convert speaking voices into singing voices whose quality resembles that of actual singing voices.
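To make the F0 control step more concrete, the following is a minimal Python sketch of how a score and tempo might be turned into an F0 contour with overshoot, vibrato, and fine fluctuation. It is not the authors' implementation: the second-order filter, the sinusoidal vibrato, the smoothed-noise fluctuation, every function name, and every parameter value (`omega`, `zeta`, `rate_hz`, `depth_cents`, the 5 ms frame) are assumptions chosen for illustration. Preparation is omitted, and vibrato is applied to the whole contour rather than only to sustained portions.

```python
import math
import random


def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def score_to_steps(notes, tempo_bpm, frame_s=0.005):
    """Expand (midi_note, beats) pairs into a stepwise log2-F0 target, one value per frame."""
    sec_per_beat = 60.0 / tempo_bpm
    target = []
    for midi_note, beats in notes:
        n_frames = round(beats * sec_per_beat / frame_s)
        target.extend([math.log2(midi_to_hz(midi_note))] * n_frames)
    return target


def add_overshoot(target, frame_s=0.005, omega=35.0, zeta=0.55):
    """Track the step target with an underdamped second-order system (zeta < 1),
    so the output exceeds the new note briefly after each change.
    omega and zeta are hypothetical values, not taken from the paper."""
    y, v = target[0], 0.0
    out = []
    for x in target:
        accel = omega * omega * (x - y) - 2.0 * zeta * omega * v
        v += accel * frame_s  # semi-implicit Euler integration
        y += v * frame_s
        out.append(y)
    return out


def add_vibrato(log_f0, frame_s=0.005, rate_hz=5.5, depth_cents=50.0):
    """Add a sinusoidal modulation; depth is converted from cents to octaves."""
    depth = depth_cents / 1200.0
    return [f + depth * math.sin(2.0 * math.pi * rate_hz * i * frame_s)
            for i, f in enumerate(log_f0)]


def add_fine_fluctuation(log_f0, depth_cents=5.0, smooth=0.9, seed=0):
    """Add small irregular deviations from one-pole low-pass-filtered Gaussian noise."""
    rng = random.Random(seed)
    depth = depth_cents / 1200.0
    state = 0.0
    out = []
    for f in log_f0:
        state = smooth * state + (1.0 - smooth) * rng.gauss(0.0, 1.0)
        out.append(f + depth * state)
    return out


def singing_f0(notes, tempo_bpm):
    """Return an F0 contour in Hz for the given score and tempo."""
    target = score_to_steps(notes, tempo_bpm)
    contour = add_fine_fluctuation(add_vibrato(add_overshoot(target)))
    return [2.0 ** f for f in contour]
```

For example, `singing_f0([(60, 1), (64, 1)], 120)` would produce a contour for two quarter notes (C4, then E4) at 120 beats per minute, sampled every 5 ms. Working in log2 frequency keeps the overshoot and vibrato depths proportional to pitch rather than to absolute Hz.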

Acoustic Material

input_speaking_male.wav: A male voice reading the lyrics of the Japanese children's song "Nanatsunoko".
input_speaking_female.wav: A female voice reading the lyrics of the Japanese children's song "Nanatsunoko".
synthesized_singing_male.wav: A synthesized male singing voice converted from input_speaking_male.wav by the proposed system.
synthesized_singing_female.wav: A synthesized female singing voice converted from input_speaking_female.wav by the proposed system.

Bibliographic reference. Saitou, Takeshi / Goto, Masataka / Unoki, Masashi / Akagi, Masato (2007): "Vocal conversion from speaking voice to singing voice using STRAIGHT", In INTERSPEECH-2007, 4005-4006.