ISCA Archive SSW 2007

Regression approaches to voice quality control based on one-to-many eigenvoice conversion

Kumi Ohta, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

This paper proposes techniques for flexibly controlling the voice quality of speech converted from a particular source speaker, based on one-to-many eigenvoice conversion (EVC). EVC controls voice quality by manipulating a small number of parameters, i.e., the weights for the eigenvectors of an eigenvoice Gaussian mixture model (EV-GMM), which is trained on multiple parallel data sets consisting of a single source speaker and many pre-stored target speakers. However, it is difficult to intuitively control the desired voice quality with those parameters because each eigenvector does not usually represent a specific physical meaning. To cope with this problem, we propose regression approaches to the EVC-based voice quality controller. Tractable voice quality control of the converted speech is achieved with a low-dimensional voice quality control vector that captures specific voice characteristics. We conducted an experimental verification of each of the proposed approaches.
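
The idea in the abstract can be illustrated with a small toy sketch. It assumes, purely for illustration, that the eigenvector weights are obtained from the voice quality control vector by a linear regression, and that the target mean of one mixture component is a bias vector plus a weighted sum of eigenvectors. The function names, matrix sizes, and numeric values below are hypothetical and are not taken from the paper, which may formulate its regression approaches differently.

```python
# Illustrative sketch only: every name and number here is hypothetical.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def weights_from_control(regression, offset, control):
    """Assumed linear regression from control vector to weights: w = R c + r0."""
    return add(matvec(regression, control), offset)

def target_mean(eigenvectors, bias, weights):
    """Target mean of one mixture component: mu = B w + b0."""
    return add(matvec(eigenvectors, weights), bias)

# Toy sizes: 2-dim control vector, 2 eigenvector weights, 3-dim mean vector.
R = [[0.5, 0.0], [0.0, 2.0]]              # hypothetical regression matrix
r0 = [0.1, -0.1]                          # hypothetical regression offset
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # columns play the role of eigenvectors
b0 = [0.0, 0.0, 1.0]                      # bias vector

control = [1.0, 0.5]                      # e.g. scores on two perceptual scales
w = weights_from_control(R, r0, control)
mu = target_mean(B, b0, w)
print("weights:", w)
print("target mean:", mu)
```

In this sketch, changing an entry of `control` shifts the target mean through the assumed regression, so the user adjusts interpretable scores rather than the eigenvector weights directly.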


Cite as: Ohta, K., Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K. (2007) Regression approaches to voice quality control based on one-to-many eigenvoice conversion. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 101-106

@inproceedings{ohta07_ssw,
  author={Kumi Ohta and Yamato Ohtani and Tomoki Toda and Hiroshi Saruwatari and Kiyohiro Shikano},
  title={{Regression approaches to voice quality control based on one-to-many eigenvoice conversion}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={101--106}
}