Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

S5: The SQEL Slovene Speech Synthesis System

N. Pavesic, Jerneja Gros

Artificial Perception Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Slovenia

An improved version of the Slovene text-to-speech system S5 is described. S5 can be used either as a stand-alone reading system or integrated into other applications. It is based on the concatenation of basic speech units, diphones, using the TD-PSOLA technique. A series of modules transforms the input text into its spoken equivalent. F0 modeling is based primarily on predicting the appropriate tonemic accent. Phone duration is predicted by a two-level approach that takes into account how speeding up or slowing down the speech affects the duration of individual phones. The adequacy of the spoken output was evaluated with several subjective tests recommended by the International Telecommunication Union (ITU).
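The abstract does not give the details of the two-level duration model, so the following Python sketch only illustrates the general idea under stated assumptions. The first level scales the total duration of a unit by a speaking-rate factor, and the second level distributes the resulting change over the individual phones according to how readily each one stretches or compresses. The names `Phone`, `elasticity`, `rate_factor`, and `predict_durations`, as well as all numeric values, are hypothetical and are not taken from S5.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    symbol: str
    intrinsic_ms: float  # assumed intrinsic duration at normal speaking rate
    elasticity: float    # assumed relative willingness to stretch or compress


def predict_durations(phones, rate_factor):
    """Return one duration (ms) per phone.

    Level 1: scale the total duration of the unit by rate_factor
             (> 1 slows speech down, < 1 speeds it up).
    Level 2: share the resulting change among the phones in proportion
             to intrinsic duration times elasticity.
    """
    base_total = sum(p.intrinsic_ms for p in phones)
    delta = base_total * rate_factor - base_total

    weights = [p.intrinsic_ms * p.elasticity for p in phones]
    weight_sum = sum(weights)
    if weight_sum == 0:
        # No phone can absorb the change; keep intrinsic durations.
        return [p.intrinsic_ms for p in phones]

    return [p.intrinsic_ms + delta * w / weight_sum
            for p, w in zip(phones, weights)]


# Illustrative values only: a nasal, a vowel, and a stop.
example = [Phone("m", 60.0, 0.5), Phone("a", 100.0, 1.0), Phone("t", 40.0, 0.0)]
slowed = predict_durations(example, rate_factor=1.2)
```

In this sketch the vowel absorbs most of the lengthening while the stop keeps its intrinsic duration, which is one simple way to express that a change in tempo does not apply uniformly to all phones. A real system would also need a lower bound on each duration, since a strong speed-up could otherwise drive an elastic phone toward zero.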


Bibliographic reference. Pavesic, N. / Gros, Jerneja (1999): "S5: the SQEL Slovene speech synthesis system", in EUROSPEECH'99, 2103-2106.