12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

A Hybrid TTS Approach for Prosody and Acoustic Modules

Iñaki Sainz, Daniel Erro, Eva Navas, Inma Hernáez

Universidad del País Vasco, Spain

Unit selection (US) TTS systems generate quite natural speech, but its quality is highly variable. Statistical parametric (SP) systems offer far more consistent quality but reduced naturalness due to their vocoding nature. We present a hybrid approach (HA) that tries to improve overall naturalness by combining both synthesis methods. Unlike other works, the fusion of methods is performed in both the prosody and acoustic modules, yielding more robust prosody prediction and greater naturalness. Objective and subjective experiments show the validity of our procedure.

Bibliographic reference. Sainz, Iñaki / Erro, Daniel / Navas, Eva / Hernáez, Inma (2011): "A hybrid TTS approach for prosody and acoustic modules", in INTERSPEECH-2011, 333-336.