The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

Refined Statistical Model Tuning for Speech Synthesis

Xu Shao, Vincent Pollet, Andrew Breen

TTS R&D, Nuance Communications

This paper describes a number of approaches to refining and tuning statistical models for speech synthesis. The first approach tunes the sizes of the decision trees for central phonemes in a context. The second approach is a refinement technique for HMM models in which a variable number of states for hidden semi-Markov models is emulated. A so-called “hard state-skip” training technique is introduced into the standard forward-backward training. The results show that both the tuning and the refinement techniques lead to increased flexibility for speech synthesis modeling.

Index Terms: TTS, HSMM, decision tree, hard skip-state


Bibliographic reference.  Shao, Xu / Pollet, Vincent / Breen, Andrew (2010): "Refined statistical model tuning for speech synthesis", In SSW7-2010, 284-287.