ITRW on Non-Linear Speech Processing
Hidden Markov Model based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled jointly. The aim of this work is to describe a Spanish HMM-TTS system that uses case-based reasoning (CBR) as an F0 estimator, and to analyse its performance both objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus whose units were clustered using contextual factors specific to the Spanish language. The results show that CBR-based F0 estimation can improve on the HMM-based baseline when synthesizing short non-declarative sentences and when only reduced contextual information is available.
Bibliographic reference. Gonzalvo, Xavi / Iriondo, Ignasi / Socoró, Joan Claudi / Alías, Francesc / Monzo, Carlos (2007): "HMM-based Spanish speech synthesis using CBR as F0 estimator", in NOLISP-2007, 7-10.