8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Probability Models of Formant Parameters for Voice Conversion

Dimitrios Rentzos (1), Saeed Vaseghi (1), Qin Yan (1), Ching-Hsiang Ho (2), Emir Turajlic (1)

(1) Brunel University, U.K.
(2) Fortune Institute of Technology, Taiwan

This paper explores the estimation and mapping of probability models of formant parameter vectors for voice conversion. Each formant parameter vector consists of the frequency, bandwidth and intensity of the resonance at a formant. The formant parameters are derived from the coefficients of a linear prediction (LP) model of speech. The formant distributions are modelled with phoneme-dependent two-dimensional hidden Markov models (HMMs) with Gaussian mixture state densities. The HMMs are subsequently used to re-estimate the formant trajectories of speech. Two alternative methods are explored for voice morphing: the first is non-uniform frequency warping, and the second is spectral mapping via rotation of the formant vectors of the source towards those of the target. Both methods transform all formant parameters (frequency, bandwidth and intensity). In addition, the factors that affect the selection of the warping ratios for the mapping function are presented, together with an experimental evaluation of voice morphing examples.
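
As an illustration of how formant parameters can be obtained from LP coefficients, the following minimal sketch uses the standard pole-angle and pole-radius relations. It is not taken from the paper: the function name `formants_from_lpc`, the coefficient convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the definition of intensity as the LP spectral magnitude (in dB) at the pole frequency are all assumptions made for this example.

```python
import numpy as np


def formants_from_lpc(a, fs):
    """Estimate (frequency, bandwidth, intensity) candidates from LP coefficients.

    a  : sequence [1, a1, ..., ap] for A(z) = 1 + sum_k a_k z^-k (assumed convention)
    fs : sampling rate in Hz
    Returns an array of shape (n, 3), sorted by frequency, with columns
    frequency (Hz), bandwidth (Hz), intensity (dB).
    """
    a = np.asarray(a, dtype=float)

    # Poles of the all-pole filter 1/A(z) are the roots of z^p * A(z).
    poles = np.roots(a)

    # Keep one pole from each complex-conjugate pair (upper half plane).
    poles = poles[np.imag(poles) > 1e-9]

    # Pole angle gives the resonance frequency, pole radius gives the bandwidth.
    omega = np.angle(poles)
    freqs = omega * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi

    # Assumed intensity measure: |1/A(e^{j omega})| in dB at each pole frequency.
    k = np.arange(len(a))
    A_at_pole = np.exp(-1j * np.outer(omega, k)) @ a
    intensities = -20.0 * np.log10(np.abs(A_at_pole))

    order = np.argsort(freqs)
    return np.column_stack([freqs, bandwidths, intensities])[order]
```

For a single second-order resonator built with a pole radius of exp(-pi * B / fs) and a pole angle of 2 * pi * F / fs, the function recovers the frequency F and bandwidth B used to construct it. In practice the candidates would still need to be screened (for example by bandwidth) before being treated as formants.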

