Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Straight-Based Voice Conversion Algorithm Based on Gaussian Mixture Model

Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano

Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Nara, Japan

A voice conversion algorithm based on the Gaussian mixture model (GMM) has been proposed by Stylianou et al. In this algorithm, the acoustic space of a speaker is represented continuously. In this paper, we apply this GMM-based voice conversion algorithm to STRAIGHT, proposed by Kawahara et al., which is recognized as a high-quality vocoder. To evaluate this voice conversion algorithm, we perform subjective and objective experiments on speech quality and speaker individuality, comparing it with a method based on codebook mapping. The results show that the GMM-based voice conversion algorithm performs better than the codebook mapping method. The effects of the amount of training data and of the number of Gaussian mixture components on the voice conversion algorithms are also investigated. These evaluation results show that the GMM-based voice conversion algorithm can be successfully applied to STRAIGHT.
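
To illustrate what a continuous representation of the acoustic space means in contrast to codebook mapping, the following is a minimal sketch, not the implementation used in the paper. It assumes scalar features (real systems convert spectral feature vectors, here those extracted by STRAIGHT) and a conversion function of the general form F(x) = sum_i P(i|x) [nu_i + s_i (x - mu_i)]. All function names and parameter values (WEIGHTS, MEANS, VARIANCES, TARGETS, SLOPES) are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability P(i | x) of each mixture component given x."""
    likes = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(likes)
    return [like / total for like in likes]


def gmm_convert(x, weights, means, variances, targets, slopes):
    """Soft conversion: posterior-weighted sum of per-component linear maps."""
    post = posteriors(x, weights, means, variances)
    return sum(
        p * (t + s * (x - m))
        for p, t, s, m in zip(post, targets, slopes, means)
    )


def codebook_convert(x, means, targets):
    """Hard conversion: output the target entry paired with the nearest source entry."""
    nearest = min(range(len(means)), key=lambda i: abs(x - means[i]))
    return targets[nearest]


# Hypothetical two-component model (illustrative values only).
WEIGHTS = [0.5, 0.5]      # mixture weights
MEANS = [-1.0, 1.0]       # source-speaker component means (mu_i)
VARIANCES = [1.0, 1.0]    # source-speaker component variances
TARGETS = [-2.0, 2.0]     # target-side offsets (nu_i)
SLOPES = [0.5, 0.5]       # per-component linear slopes (s_i)

if __name__ == "__main__":
    for x in (-0.1, 0.1):
        soft = gmm_convert(x, WEIGHTS, MEANS, VARIANCES, TARGETS, SLOPES)
        hard = codebook_convert(x, MEANS, TARGETS)
        print(f"x={x:+.1f}  gmm={soft:+.3f}  codebook={hard:+.1f}")
```

In this toy setting the codebook output jumps between entries as the input crosses the boundary between two code vectors, while the GMM output changes smoothly because every component contributes in proportion to its posterior probability.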

Bibliographic reference.  Toda, Tomoki / Lu, Jinlin / Saruwatari, Hiroshi / Shikano, Kiyohiro (2000): "Straight-based voice conversion algorithm based on Gaussian mixture model", In ICSLP-2000, vol.3, 279-282.