ISCA Archive Interspeech 2008

Unsupervised adaptation for HMM-based speech synthesis

Simon King, Keiichi Tokuda, Heiga Zen, Junichi Yamagishi

It is now possible to synthesise speech using HMMs with quality comparable to that of unit-selection techniques. Generating speech from a model has many potential advantages over concatenating waveforms. The most exciting is model adaptation. It has been shown that supervised speaker adaptation can yield high-quality synthetic voices with an order of magnitude less data than is required to train a speaker-dependent model or to build a basic unit-selection system. Such supervised methods require labelled adaptation data for the target speaker. In this paper, we introduce a method capable of unsupervised adaptation, using only speech from the target speaker without any labelling.
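To give a flavour of what "adapting a model with a small amount of target-speaker data" means in the HMM setting, the toy sketch below shows bias-only mean adaptation: a single shared offset vector is estimated from adaptation frames soft-aligned to states, then added to every Gaussian mean. This is a deliberate simplification for illustration only, not the method of the paper (which uses far richer transform-based adaptation); all names and data here are hypothetical.

```python
# Toy sketch of bias-only mean adaptation for HMM state Gaussians.
# A single global bias is estimated from occupancy-weighted residuals
# between adaptation frames and the aligned state means, then applied
# to all means. Illustrative only; not the paper's actual algorithm.

def adapt_means(means, frames, gammas):
    """means:  list of state mean vectors (lists of floats).
    frames: adaptation observation vectors from the target speaker.
    gammas: gammas[t][s] = occupancy of state s at frame t.
    Returns the means shifted by the estimated global bias."""
    dim = len(means[0])
    num = [0.0] * dim   # occupancy-weighted residual sum
    den = 0.0           # total occupancy
    for t, x in enumerate(frames):
        for s, mu in enumerate(means):
            g = gammas[t][s]
            if g == 0.0:
                continue
            den += g
            for d in range(dim):
                num[d] += g * (x[d] - mu[d])
    bias = [n / den for n in num]
    return [[mu[d] + bias[d] for d in range(dim)] for mu in means]

# Example: two 2-D state means; the target speaker's frames are
# shifted by +1 in the first dimension, so every mean moves by [1, 0].
means = [[0.0, 0.0], [2.0, 2.0]]
frames = [[1.0, 0.0], [3.0, 2.0]]
gammas = [[1.0, 0.0], [0.0, 1.0]]  # hard alignment for simplicity
print(adapt_means(means, frames, gammas))  # [[1.0, 0.0], [3.0, 2.0]]
```

In the supervised case the alignments (gammas) come from known transcriptions; the unsupervised setting of the paper must manage without such labels.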


doi: 10.21437/Interspeech.2008-186

Cite as: King, S., Tokuda, K., Zen, H., Yamagishi, J. (2008) Unsupervised adaptation for HMM-based speech synthesis. Proc. Interspeech 2008, 1869-1872, doi: 10.21437/Interspeech.2008-186

@inproceedings{king08_interspeech,
  author={Simon King and Keiichi Tokuda and Heiga Zen and Junichi Yamagishi},
  title={{Unsupervised adaptation for HMM-based speech synthesis}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1869--1872},
  doi={10.21437/Interspeech.2008-186}
}