11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Adaptive Voice-Quality Control Based on One-to-Many Eigenvoice Conversion

Kumi Ohta (1), Tomoki Toda (2), Yamato Ohtani (3), Hiroshi Saruwatari (2), Kiyohiro Shikano (2)

(1) Brother Industries Ltd., Japan
(2) NAIST, Japan
(3) Toshiba Corporation, Japan

This paper presents adaptive voice-quality control methods based on one-to-many eigenvoice conversion. A multiple regression Gaussian mixture model (MR-GMM) has been proposed to allow intuitive control of the converted voice quality by manipulating a small number of control parameters. The MR-GMM also allows the optimum control parameters to be estimated when target speech samples are available. However, its adaptation performance is limited because the control parameters are too few to model the wide range of voice qualities found across target speakers. To improve the adaptation performance while retaining the capability for voice-quality control, this paper proposes an extended MR-GMM (EMR-GMM) with additional adaptive parameters that extend the subspace modeling the target voice quality. Experimental results demonstrate that the EMR-GMM significantly improves the adaptation performance while still allowing intuitive control of the converted voice quality.
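
The idea of extending the subspace can be illustrated with a small sketch. It is not taken from the paper: it assumes that the target mean vector of one mixture component is a linear function of the control parameters (bias plus basis times control vector), and that the extended model adds a second, hypothetical basis weighted by the adaptive parameters. The function names, dimensions, and numbers are all made up for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # list of rows


def matvec(mat: Matrix, vec: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def mr_gmm_mean(bias: Vector, control_basis: Matrix, control: Vector) -> Vector:
    """Assumed MR-GMM form: mean = bias + control_basis @ control."""
    return [b + d for b, d in zip(bias, matvec(control_basis, control))]


def emr_gmm_mean(
    bias: Vector,
    control_basis: Matrix,
    control: Vector,
    adaptive_basis: Matrix,
    adaptive: Vector,
) -> Vector:
    """Assumed EMR-GMM form: the MR-GMM mean plus adaptive_basis @ adaptive.

    The adaptive term moves the mean in directions the control
    parameters alone cannot reach (hypothetical formulation).
    """
    base = mr_gmm_mean(bias, control_basis, control)
    return [m + d for m, d in zip(base, matvec(adaptive_basis, adaptive))]


# Toy values for a 2-dimensional mean, 2 control parameters, 1 adaptive parameter.
bias = [1.0, 2.0]
control_basis = [[0.5, 0.0], [0.0, -0.5]]
control = [2.0, 4.0]        # set by hand, or estimated from target speech
adaptive_basis = [[1.0], [1.0]]
adaptive = [0.25]           # estimated from target speech only

mr_mean = mr_gmm_mean(bias, control_basis, control)
emr_mean = emr_gmm_mean(bias, control_basis, control, adaptive_basis, adaptive)
print(mr_mean, emr_mean)
```

Under these assumptions, setting the adaptive parameters to zero recovers the MR-GMM mean, so the control parameters keep their meaning while the adaptive term absorbs speaker characteristics that the control subspace does not cover.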


Bibliographic reference. Ohta, Kumi / Toda, Tomoki / Ohtani, Yamato / Saruwatari, Hiroshi / Shikano, Kiyohiro (2010): "Adaptive voice-quality control based on one-to-many eigenvoice conversion", In INTERSPEECH-2010, 2158-2161.