EUROSPEECH 2003 - INTERSPEECH 2003
Voice conversion is a technique for modifying a source speaker's speech so that it sounds as if it were spoken by a target speaker. A popular approach to voice conversion is to apply a linear transformation to the spectral envelope. However, conventional parameter estimation based on least-squares error optimization does not necessarily lead to the best perceptual result. In this paper, a perceptually weighted linear transformation is presented, based on minimizing the perceptual spectral distance between the voices of the source and target speakers. The paper describes the new conversion algorithm and presents a preliminary evaluation of its performance using objective and subjective tests.
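As a rough illustration of the contrast drawn in the abstract, the two estimation criteria can be written side by side. This is only a sketch under assumed notation: the abstract does not define the perceptual distance, so the weighting matrix P_t below is a hypothetical placeholder, and x_t, y_t, and W are names introduced here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   x_t, y_t : time-aligned source and target spectral vectors, t = 1..T
%   W        : linear conversion matrix
%   P_t      : hypothetical symmetric positive-definite perceptual weight for frame t

% Conventional least-squares criterion and its normal equations
\hat{W}_{\mathrm{LS}} = \arg\min_{W} \sum_{t=1}^{T} \lVert y_t - W x_t \rVert^{2}
\quad\Longrightarrow\quad
\hat{W}_{\mathrm{LS}} \sum_{t=1}^{T} x_t x_t^{\top} = \sum_{t=1}^{T} y_t x_t^{\top}

% Weighted criterion (assumed form) and its normal equations
\hat{W}_{\mathrm{P}} = \arg\min_{W} \sum_{t=1}^{T} (y_t - W x_t)^{\top} P_t \,(y_t - W x_t)
\quad\Longrightarrow\quad
\sum_{t=1}^{T} P_t \,\hat{W}_{\mathrm{P}}\, x_t x_t^{\top} = \sum_{t=1}^{T} P_t \, y_t x_t^{\top}
```

With P_t equal to the identity, the weighted form reduces to the least-squares one. A weighting that is constant across frames would also give the same unconstrained solution, so in this sketch the weighting changes the estimate only when it varies from frame to frame or when W is constrained.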
Bibliographic reference. Ye, Hui / Young, Steve (2003): "Perceptually weighted linear transformations for voice conversion", In EUROSPEECH-2003, 2409-2412.