10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Context-Dependent Additive log F_0 Model for HMM-Based Speech Synthesis

Heiga Zen, Norbert Braunschweiler

Toshiba Research Europe Ltd., UK

This paper proposes a context-dependent additive acoustic modelling technique and applies it to logarithmic fundamental frequency (log F0) modelling for HMM-based speech synthesis. In the proposed technique, the mean vector of each state-output distribution is composed as a weighted sum of decision-tree-clustered, context-dependent bias terms. The model parameters are estimated and the decision trees are built under the maximum likelihood (ML) criterion. The technique has the potential to capture the additive structure of log F0 contours. A preliminary experiment on a small database yielded encouraging results.
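
As a rough illustration of the additive composition described in the abstract, the following minimal Python sketch builds a state-output mean as a weighted sum of bias terms, each selected by its own decision tree from the context. Everything specific here is an assumption made for illustration and is not taken from the paper: the three hand-written trees, their questions (`accented`, `phrase_final`), the leaf values, and the unit weights. In the actual technique the trees and parameters are obtained under the ML criterion, not written by hand.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]
# One additive component: (weight, tree mapping a context to a leaf bias).
Component = Tuple[float, Callable[[Context], float]]


def baseline_tree(ctx: Context) -> float:
    # Hypothetical single-leaf tree: a global log F0 level (about log of 148 Hz).
    return 5.0


def accent_tree(ctx: Context) -> float:
    # Hypothetical one-question tree: is the syllable accented?
    return 0.20 if ctx["accented"] == "yes" else -0.05


def position_tree(ctx: Context) -> float:
    # Hypothetical one-question tree: is the syllable phrase-final?
    return -0.15 if ctx["phrase_final"] == "yes" else 0.0


def additive_mean(ctx: Context, components: List[Component]) -> float:
    # Mean of the state-output distribution: weighted sum of the bias
    # terms that each tree selects for this context.
    return sum(weight * tree(ctx) for weight, tree in components)


# Illustrative unit weights; these are assumptions, not values from the paper.
components: List[Component] = [
    (1.0, baseline_tree),
    (1.0, accent_tree),
    (1.0, position_tree),
]

ctx = {"accented": "yes", "phrase_final": "yes"}
mean_log_f0 = additive_mean(ctx, components)
```

Because each tree clusters contexts independently, a small number of leaves per tree can cover many context combinations, which is one way to read the abstract's claim about capturing the additive structure of log F0 contours.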


Bibliographic reference. Zen, Heiga / Braunschweiler, Norbert (2009): "Context-dependent additive log F_0 model for HMM-based speech synthesis", in INTERSPEECH-2009, 2091-2094.