ISCA Archive Eurospeech 1999

An efficient speaker adaptation method for TTS duration model

Wentao Gu, Chilin Shih, Jan P.H. van Santen

This paper continues our previous study [1], which proposed an efficient speaker adaptation method for TTS duration models. The goal was achieved by text selection and weight estimation. That result was preliminary because it was derived from only one sentence set. Having analyzed multiple sentence sets, we can now better evaluate the robustness of the method and draw a more confident conclusion. The proposed method is supported by the observation that some language-specific information is well preserved across speakers. In a further comparison of various adaptation models, the linear weighted model performs best, and therefore offers an efficient way to adapt the duration model from the source speaker to target speakers with a very small training corpus.


doi: 10.21437/Eurospeech.1999-401

Cite as: Gu, W., Shih, C., Santen, J.P.H.v. (1999) An efficient speaker adaptation method for TTS duration model. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1839-1842, doi: 10.21437/Eurospeech.1999-401

@inproceedings{gu99_eurospeech,
  author={Wentao Gu and Chilin Shih and Jan P.H. van Santen},
  title={{An efficient speaker adaptation method for TTS duration model}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1839--1842},
  doi={10.21437/Eurospeech.1999-401}
}