5th International Conference on Spoken Language Processing
This paper discusses a methodology for adapting an existing TTS duration model to a new speaker using a minimal set of sentences, so that the adapted model captures inter-speaker variation. The assumption is that the original duration database contains information about both language-specific and speaker-specific duration characteristics. In training a duration model for a new speaker, only the speaker-specific information needs to be modeled, so the size of the training data can be reduced drastically. Results from several experiments are compared and discussed.
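As a rough illustration of why adaptation can need so little data, the sketch below keeps a base model's predicted durations fixed and fits only two speaker-specific parameters from a handful of observations. This is not the authors' method, which the abstract does not specify. The log-linear form, the function names `fit_speaker_adaptation` and `adapt_duration`, and all duration values are assumptions made up for the example.

```python
import math


def fit_speaker_adaptation(base_ms, observed_ms):
    """Least-squares fit of log(observed) = a + b * log(base).

    base_ms: durations predicted by the existing (base) model, in ms.
    observed_ms: durations measured for the new speaker on the same
    segments, in ms. Both lists are hypothetical illustration data.
    """
    if len(base_ms) != len(observed_ms) or len(base_ms) < 2:
        raise ValueError("need at least two paired durations")
    x = [math.log(d) for d in base_ms]
    y = [math.log(d) for d in observed_ms]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0.0:
        raise ValueError("base durations must not all be equal")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = cov_xy / var_x          # speaker-specific slope in the log domain
    a = mean_y - b * mean_x     # speaker-specific offset in the log domain
    return a, b


def adapt_duration(base_ms, a, b):
    """Map one base-model duration (ms) to the new speaker's scale."""
    return math.exp(a + b * math.log(base_ms))


# Made-up example: the new speaker is uniformly 20% slower than the base.
base = [50.0, 80.0, 120.0, 200.0]
observed = [1.2 * d for d in base]
a, b = fit_speaker_adaptation(base, observed)
adapted = adapt_duration(100.0, a, b)
```

Only two parameters are estimated here, so a few paired segments are enough. Everything language-specific stays in the base model, which mirrors the assumption stated in the abstract in a deliberately simplified form.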
Bibliographic reference. Shih, Chilin / Gu, Wentao / Santen, Jan P. H. van (1998): "Efficient adaptation of TTS duration model to new speakers", In ICSLP-1998, paper 0177.