12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Automatic Prosody Generation for Serbo-Croatian Speech Synthesis Based on Regression Trees

Milan Sečujski (1), Darko Pekar (2), Nikša Jakovljević (1)

(1) University of Novi Sad, Serbia
(2) AlfaNum - Speech Technologies Ltd., Serbia

The paper presents a module for the automatic generation of prosodic features of synthesized speech, namely f0 targets and phonetic segment durations, within AlfaNumTTS, the most sophisticated speech synthesis system for the Serbo-Croatian language to date. The module is based on regression trees trained on a studio-recorded, single-speaker database of Serbo-Croatian. The database has been annotated for phonemic identity as well as for a number of prosodic events, such as pitch accents, phrase breaks and prosodic prominence. In addition to the traditional description of the intonational phonology of Serbo-Croatian in terms of four distinct accent types, this study examines the possibility of representing these accents as tonal sequences, as suggested in recent linguistic literature. The results confirm that the four accents can indeed be reduced to sequences of high and low tones without loss of quality, provided that the phonemic length contrast is preserved.
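
As a rough illustration of how a regression tree can map symbolic context features to a segment duration, a minimal sketch follows. It is not the AlfaNumTTS implementation: the feature names (`phone`, `stressed`, `long`), the duration values and the helper functions are all hypothetical, chosen only to show binary yes/no questions selected by minimizing the sum of squared errors.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_question(rows, targets):
    """Return the (feature, value) yes/no question that most reduces SSE, or None."""
    best, best_cost = None, sse(targets)
    questions = sorted({(f, v) for row in rows for f, v in row.items()})
    for feat, val in questions:
        yes = [t for r, t in zip(rows, targets) if r[feat] == val]
        no = [t for r, t in zip(rows, targets) if r[feat] != val]
        if not yes or not no:
            continue  # the question does not split the data
        cost = sse(yes) + sse(no)
        if cost < best_cost - 1e-9:
            best, best_cost = (feat, val), cost
    return best

def grow(rows, targets, min_leaf=1):
    """Grow a tree recursively; a leaf stores the mean target of its examples."""
    question = best_question(rows, targets)
    if question is None or len(targets) <= min_leaf:
        return mean(targets)
    feat, val = question
    yes = [i for i, r in enumerate(rows) if r[feat] == val]
    no = [i for i, r in enumerate(rows) if r[feat] != val]
    return {
        "question": question,
        "yes": grow([rows[i] for i in yes], [targets[i] for i in yes], min_leaf),
        "no": grow([rows[i] for i in no], [targets[i] for i in no], min_leaf),
    }

def predict(tree, row):
    """Follow the yes/no questions down to a leaf and return its value."""
    while isinstance(tree, dict):
        feat, val = tree["question"]
        tree = tree["yes"] if row[feat] == val else tree["no"]
    return tree

# Hypothetical toy data: context features and segment durations in milliseconds.
rows = [
    {"phone": "a", "stressed": "yes", "long": "yes"},
    {"phone": "a", "stressed": "yes", "long": "no"},
    {"phone": "a", "stressed": "no", "long": "no"},
    {"phone": "t", "stressed": "no", "long": "no"},
    {"phone": "t", "stressed": "yes", "long": "no"},
]
durations = [140.0, 90.0, 70.0, 60.0, 65.0]

tree = grow(rows, durations)
print(tree["question"])
print(predict(tree, {"phone": "a", "stressed": "no", "long": "no"}))
```

In this toy setting the first question the tree asks concerns phonemic length, since that split removes the most variance from the invented durations. A separate tree of the same kind could be trained with f0 targets as the predicted value.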


Bibliographic reference. Sečujski, Milan / Pekar, Darko / Jakovljević, Nikša (2011): "Automatic prosody generation for Serbo-Croatian speech synthesis based on regression trees", in INTERSPEECH-2011, 3157-3160.