One-to-many eigenvoice conversion (EVC) converts the voice of a specific source speaker into that of arbitrary target speakers. An eigenvoice Gaussian mixture model (EV-GMM) is trained in advance on multiple parallel data sets, each consisting of utterances by the source speaker and one of many pre-stored target speakers. The EV-GMM is then adapted to an arbitrary target speaker from only a few utterances by estimating a small number of free parameters. Because so few parameters are adapted, the initial EV-GMM directly affects the conversion performance of the adapted EV-GMM. To prepare a better initial model, this paper proposes speaker adaptive training (SAT) of a canonical EV-GMM in one-to-many EVC. Objective and subjective evaluations demonstrate that SAT significantly improves the performance of EVC.
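The adaptation step mentioned in the abstract (estimating a small number of free parameters from a few target utterances) can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the formulas. The sketch assumes the common eigenvoice formulation in which the target mean of mixture component i is `B[i] @ w + b[i]`, with a weight vector `w` shared across components as the only adapted parameter. It further assumes diagonal covariances and uses only the target-side marginal GMM. The function name `adapt_weights` and all argument names are hypothetical.

```python
import numpy as np

def adapt_weights(Y, alpha, B, b, var, n_iter=10):
    """Estimate eigenvoice weights w from target frames with EM (illustrative).

    Y:     (T, D)    target-speaker feature frames
    alpha: (M,)      mixture weights
    B:     (M, D, J) eigenvectors of each mixture component (assumed given)
    b:     (M, D)    bias vectors of each mixture component (assumed given)
    var:   (M, D)    diagonal target variances (a simplifying assumption)
    """
    M, D, J = B.shape
    w = np.zeros(J)  # start from the bias vectors alone
    for _ in range(n_iter):
        mu = B @ w + b  # (M, D) target means under the current weights

        # E-step: component posteriors for every frame, computed in the log domain.
        diff = Y[:, None, :] - mu[None, :, :]  # (T, M, D)
        log_p = (np.log(alpha)
                 - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
                 - 0.5 * np.sum(diff ** 2 / var, axis=2))  # (T, M)
        log_p -= log_p.max(axis=1, keepdims=True)
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)

        # M-step: w is the solution of a J x J weighted least-squares system.
        N = gamma.sum(axis=0)  # (M,) soft frame counts
        Ybar = gamma.T @ Y     # (M, D) posterior-weighted frame sums
        A = np.zeros((J, J))
        c = np.zeros(J)
        for i in range(M):
            Bw = B[i] / var[i][:, None]  # inverse-variance-weighted eigenvectors
            A += N[i] * (B[i].T @ Bw)
            c += Bw.T @ (Ybar[i] - N[i] * b[i])
        w = np.linalg.solve(A, c)
    return w
```

Only the J entries of `w` are estimated, so the eigenvectors, bias vectors, and variances of the initial model carry over unchanged to the adapted model. This is one way to see why the quality of the initial EV-GMM matters for conversion performance.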
Bibliographic reference. Ohtani, Yamato / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro (2007): "Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model", In INTERSPEECH-2007, 1981-1984.