Sixth European Conference on Speech Communication and Technology
This paper continues our previous study, in which an efficient speaker adaptation method was proposed for the TTS duration model. The goal was achieved by text selection and weight estimation. The result there was preliminary because it was derived from only one sentence set. Having analyzed multiple sentence sets, we can now better evaluate the robustness of the method and draw a more confident conclusion. The proposed method is supported by the observation that some language-specific information is well preserved across speakers. In a further comparison of various adaptation models, the linear weighted model shows the best performance, and it therefore offers an efficient way to adapt the duration model from the source speaker to target speakers with a very small training corpus.
Bibliographic reference. Gu, Wentao / Shih, Chilin / Santen, Jan P.H. van (1999): "An efficient speaker adaptation method for TTS duration model", In EUROSPEECH'99, 1839-1842.