5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Efficient Adaptation of TTS Duration Model to New Speakers

Chilin Shih (1), Wentao Gu (2), Jan P. H. van Santen (1)

(1) Bell Laboratories, Lucent Technologies, USA
(2) Shanghai Jiaotong University, China

This paper discusses a methodology that uses a minimal set of sentences to adapt an existing TTS duration model to capture inter-speaker variation. The assumption is that the original duration database contains information about both language-specific and speaker-specific duration characteristics. When training a duration model for a new speaker, only the speaker-specific information needs to be modeled, so the size of the training data can be reduced drastically. Results from several experiments are compared and discussed.
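
As a rough illustration of why adaptation needs far less data than training from scratch, the sketch below keeps an existing model's predictions fixed and fits only two speaker-specific parameters from a handful of segments. This is not the method of the paper, whose details the abstract does not give. The log-linear form, the function names `fit_speaker_adaptation` and `adapt_duration`, and the toy durations are all assumptions made for this example.

```python
import math


def fit_speaker_adaptation(base_ms, observed_ms):
    """Fit log(observed) = a * log(base) + b by ordinary least squares.

    base_ms:     durations predicted by the existing (source-speaker) model
    observed_ms: durations measured for the new speaker on the same segments
    Hypothetical helper; the log-linear mapping is an assumption.
    """
    xs = [math.log(d) for d in base_ms]
    ys = [math.log(d) for d in observed_ms]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x          # speaker-specific slope in the log domain
    b = mean_y - a * mean_x     # speaker-specific offset (overall rate)
    return a, b


def adapt_duration(base_ms, a, b):
    """Map one base-model duration (ms) to the new speaker."""
    return math.exp(a * math.log(base_ms) + b)


# Toy data (invented): the new speaker is uniformly 20% slower.
base = [60.0, 85.0, 120.0, 150.0, 210.0]
observed = [1.2 * d for d in base]

a, b = fit_speaker_adaptation(base, observed)
adapted = adapt_duration(100.0, a, b)
```

Only `a` and `b` are estimated here, so a few segments suffice. Everything the base model has learned about segment identity and context is reused unchanged.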

Bibliographic reference.  Shih, Chilin / Gu, Wentao / Santen, Jan P. H. van (1998): "Efficient adaptation of TTS duration model to new speakers", In ICSLP-1998, paper 0177.