8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Formant-based Synthesis of Singing

Sten Ternström, Johan Sundberg

Department of Speech, Music and Hearing, School of Computer Science and Communication, Kungliga Tekniska Högskolan, Stockholm, Sweden

Rule-driven formant synthesis is a legacy technique that still has certain advantages over currently prevailing methods: its memory footprint is small and its flexibility is high. With a modular, interactive synthesis engine, it is easy to test the perceptual effect of different source waveforms and formant filter configurations. The rule system makes it possible to investigate how different styles and singers' voices are represented in the low-level acoustic features, without changing the score. It remains difficult to achieve natural-sounding consonants and to integrate the higher abstraction levels of musical expression.
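As a rough illustration of the source-filter idea behind formant synthesis, the following minimal Python sketch passes an impulse-train source through a cascade of two-pole resonators, one per formant. It is not the authors' synthesis engine: the resonator form follows the widely used Klatt-style digital resonator, and the function names, the 16000 Hz sampling rate, the fundamental frequency, and the formant frequencies and bandwidths are all assumptions chosen for illustration.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a two-pole resonator with unity gain at 0 Hz.

    Difference equation: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]
    """
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(signal, freq_hz, bw_hz, fs):
    """Filter a list of samples through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def pulse_train(f0_hz, duration_s, fs):
    """Very crude voice source: one unit impulse per pitch period."""
    n = int(duration_s * fs)
    period = int(round(fs / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesize_vowel(f0_hz, formants, duration_s=0.5, fs=16000):
    """Cascade the source through one resonator per (frequency, bandwidth) pair."""
    signal = pulse_train(f0_hz, duration_s, fs)
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    return signal


# Hypothetical formant settings for an /a/-like vowel (illustrative only).
FORMANTS_A = [(700.0, 80.0), (1100.0, 90.0), (2500.0, 120.0)]
samples = synthesize_vowel(110.0, FORMANTS_A)
```

Changing the source function or the list of formant pairs changes the timbre independently of the pitch, which is the kind of flexibility the abstract refers to. A realistic source would use a shaped glottal pulse rather than bare impulses.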

Acoustic Material

summertime_1_baritone.wav: Operatic baritone. This is the default rule set. A new control of spectrum slope was implemented for this example.
summertime_2_mixed.wav: Mixed. Formant settings obtained from a female singer reportedly performing in a mode that is intermediate between opera and musical theatre.
summertime_3_malejazz.wav: Male jazz club singer. More aspirative noise. Vibrato is added only toward the end of long tones.
summertime_4_child.wav: Child singer. The sampling rate was raised to 22050 Hz and a simpler source pulse with a steeper spectrum rolloff was used. The higher formants were scaled up about 1.5 times and more aspirative noise was added. The formant bandwidths were doubled, and the vibrato was removed.
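The child-singer example is described as a set of parameter changes relative to another rule set. A minimal sketch of such a transformation in Python might look as follows. The `VoiceSettings` structure, the baseline values, the choice of which formants count as "higher" (here F3 and above), and the aspiration gain are all hypothetical; only the 22050 Hz sampling rate, the factor of about 1.5, the doubled bandwidths, and the removed vibrato come from the description above. The simpler source pulse is not modeled here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceSettings:
    sample_rate_hz: int
    formants_hz: tuple       # F1, F2, F3, ... in Hz
    bandwidths_hz: tuple     # one bandwidth per formant, in Hz
    vibrato_depth: float     # 0.0 means no vibrato
    aspiration_level: float  # relative amount of aspirative noise


def to_child_voice(base, first_scaled_index=2, formant_scale=1.5,
                   aspiration_gain=2.0):
    """Derive child-like settings from a base voice.

    first_scaled_index and aspiration_gain are assumptions: the source
    does not say which formants were scaled or how much noise was added.
    """
    sample_rate_hz = 22050
    formants = tuple(
        f * formant_scale if i >= first_scaled_index else f
        for i, f in enumerate(base.formants_hz)
    )
    # Sanity check: every formant must lie below the Nyquist frequency.
    if any(f >= sample_rate_hz / 2 for f in formants):
        raise ValueError("formant at or above Nyquist frequency")
    return replace(
        base,
        sample_rate_hz=sample_rate_hz,
        formants_hz=formants,
        bandwidths_hz=tuple(2.0 * bw for bw in base.bandwidths_hz),
        vibrato_depth=0.0,
        aspiration_level=base.aspiration_level * aspiration_gain,
    )


# Hypothetical baseline values, for illustration only.
base = VoiceSettings(
    sample_rate_hz=16000,
    formants_hz=(600.0, 1000.0, 2400.0, 3000.0, 3800.0),
    bandwidths_hz=(80.0, 90.0, 120.0, 150.0, 200.0),
    vibrato_depth=1.0,
    aspiration_level=0.1,
)
child = to_child_voice(base)
```

Keeping the settings immutable and deriving each voice from a base makes it easy to compare rule sets without touching the score.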

Bibliographic reference. Ternström, Sten / Sundberg, Johan (2007): "Formant-based synthesis of singing", in INTERSPEECH-2007, 4013-4014.