ISCA Archive Interspeech 2008

Maximum a posteriori adaptation for many-to-one eigenvoice conversion

Daisuke Tani, Tomoki Toda, Yamato Ohtani, Hiroshi Saruwatari, Kiyohiro Shikano

Many-to-one eigenvoice conversion (EVC) converts an arbitrary speaker's voice into a predetermined target speaker's voice. In this method, a canonical eigenvoice Gaussian mixture model is adapted to any source speaker using only a few utterances as adaptation data. In this paper, we propose many-to-one EVC based on maximum a posteriori (MAP) adaptation to make the adaptation process more robust to the amount of adaptation data. Results of objective and subjective evaluations demonstrate that the proposed method is more effective than conventional many-to-one VC methods for any amount of adaptation data (e.g., from 300 ms to 16 utterances).


doi: 10.21437/Interspeech.2008-421

Cite as: Tani, D., Toda, T., Ohtani, Y., Saruwatari, H., Shikano, K. (2008) Maximum a posteriori adaptation for many-to-one eigenvoice conversion. Proc. Interspeech 2008, 1461-1463, doi: 10.21437/Interspeech.2008-421

@inproceedings{tani08_interspeech,
  author={Daisuke Tani and Tomoki Toda and Yamato Ohtani and Hiroshi Saruwatari and Kiyohiro Shikano},
  title={{Maximum a posteriori adaptation for many-to-one eigenvoice conversion}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1461--1463},
  doi={10.21437/Interspeech.2008-421}
}