EUROSPEECH 2003 - INTERSPEECH 2003
This paper explores the estimation and mapping of probability models of formant parameter vectors for voice conversion. Each formant parameter vector consists of the frequency, bandwidth and intensity of the resonance at a formant. The formant parameters are derived from the coefficients of a linear prediction (LP) model of speech. The formant distributions are modelled with phoneme-dependent two-dimensional hidden Markov models (HMMs) with Gaussian mixture state densities. The HMMs are subsequently used for re-estimation of the formant trajectories of speech. Two alternative methods are explored for voice morphing. The first is a non-uniform frequency warping method; the second is based on spectral mapping via rotation of the formant vectors of the source towards those of the target. Both methods transform all formant parameters (frequency, bandwidth and intensity). In addition, the factors that affect the selection of the warping ratios for the mapping function are presented. An experimental evaluation of voice morphing examples is also presented.
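As an illustration of how formant parameters can be derived from LP coefficients, the following minimal sketch uses the standard pole-angle and pole-radius relations for an all-pole model. It is not taken from the paper: the function names `lp_to_formants` and `resonator`, the sampling rate and the test resonances are assumptions chosen for the example, and the intensity parameter is omitted because the abstract does not define how it is computed.

```python
import numpy as np

def lp_to_formants(a, fs):
    """Estimate resonance frequencies and bandwidths (Hz) from LP coefficients.

    a  : coefficients [1, a1, ..., ap] of A(z) = 1 + a1*z^-1 + ... + ap*z^-p
    fs : sampling rate in Hz
    (Hypothetical helper; standard textbook relations, not the paper's code.)
    """
    poles = np.roots(a)                      # roots of A(z) are the model poles
    poles = poles[poles.imag > 0]            # keep one pole per conjugate pair
    freqs = np.angle(poles) * fs / (2 * np.pi)     # pole angle -> frequency
    bws = -np.log(np.abs(poles)) * fs / np.pi      # pole radius -> bandwidth
    order = np.argsort(freqs)
    return freqs[order], bws[order]

def resonator(f, bw, fs):
    """Second-order LP polynomial with one resonance at f Hz, bandwidth bw Hz."""
    r = np.exp(-np.pi * bw / fs)
    theta = 2 * np.pi * f / fs
    return np.array([1.0, -2 * r * np.cos(theta), r * r])

# Assumed example: two synthetic resonances cascaded into a 4th-order model.
fs = 8000
a = np.convolve(resonator(500, 80, fs), resonator(1500, 120, fs))
freqs, bws = lp_to_formants(a, fs)
print(np.round(freqs, 1), np.round(bws, 1))
```

In practice, poles with very wide bandwidths or very low frequencies are usually discarded before the remaining candidates are labelled as formants; that selection step is left out of this sketch.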
Bibliographic reference. Rentzos, Dimitrios / Vaseghi, Saeed / Yan, Qin / Ho, Ching-Hsiang / Turajlic, Emir (2003): "Probability models of formant parameters for voice conversion", In EUROSPEECH-2003, 2405-2408.