EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Perceptually Weighted Linear Transformations for Voice Conversion

Hui Ye, Steve Young

Cambridge University, U.K.

Voice conversion is a technique for modifying a source speaker's speech so that it sounds as if it were spoken by a target speaker. A popular approach is to apply a linear transformation to the spectral envelope. However, conventional parameter estimation based on least-squares error optimization does not necessarily lead to the best perceptual result. This paper presents a perceptually weighted linear transformation that is based on minimizing the perceptual spectral distance between the voices of the source and target speakers. The paper describes the new conversion algorithm and presents a preliminary evaluation of its performance based on objective and subjective tests.
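
As a rough illustration of the contrast between plain least-squares estimation and a weighted criterion, the following Python sketch fits a linear transformation between time-aligned source and target spectral feature vectors, optionally with a per-frame weight on the squared error. It is not the algorithm of the paper: the abstract does not specify the perceptual weighting, so the function names (fit_linear_transform, convert), the affine form y ≈ W [x; 1], and the frame_weights argument are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_transform(X, Y, frame_weights=None):
    """Fit W so that y ≈ W @ [x; 1] for time-aligned frames.

    X, Y: arrays of shape (T, d) holding source and target feature vectors.
    frame_weights: optional non-negative weights of shape (T,). This is a
    hypothetical stand-in for a perceptual weighting; with None, the fit
    reduces to ordinary least squares.
    Returns W of shape (d, d + 1); the last column is the bias term.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    T = X.shape[0]
    # Append a constant 1 to every source frame so the bias is learned too.
    Xa = np.hstack([X, np.ones((T, 1))])
    if frame_weights is None:
        frame_weights = np.ones(T)
    # Scaling each row by sqrt(w_t) turns the weighted problem
    #   min_W sum_t w_t * ||y_t - W [x_t; 1]||^2
    # into an ordinary least-squares problem.
    s = np.sqrt(np.asarray(frame_weights, dtype=float))[:, None]
    A, *_ = np.linalg.lstsq(s * Xa, s * Y, rcond=None)
    return A.T


def convert(W, X):
    """Apply the fitted transformation to source frames X of shape (T, d)."""
    X = np.asarray(X, dtype=float)
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xa @ W.T
```

With uniform weights the two fits coincide; the weights only change the result when some frames are to count more than others in the error criterion.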

Bibliographic reference. Ye, Hui / Young, Steve (2003): "Perceptually weighted linear transformations for voice conversion", in EUROSPEECH-2003, 2409-2412.