ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification

Martigny, Switzerland
April 7-9, 1994

Voice Conversion by Whole-Spectrum Scaling

P. A. Chan, Robert I. Damper

Department of Electronics and Computer Science, University of Southampton, England, UK

We describe experiments aimed at quantifying the effectiveness of whole-spectrum multiplicative scaling, with different scaling factors k, as a voice-conversion technique. A review of the literature indicated that the fundamental frequency of female excitation is typically a factor of 1.7 greater than that of male excitation, whereas female formants are only some 1.16 times higher. A single, global setting of k can therefore only be a compromise between the competing requirements of correctly scaling the excitation and the envelope parts of the spectrum. Nonetheless, we show that the technique can achieve a useful degree of conversion. While female-to-male transformation was more successful than male-to-female in terms of perceived gender change, male speech appeared more robust in terms of retaining naturalness and intelligibility when transformed.
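The compromise inherent in a single global k can be illustrated with a short numerical sketch. It is not the authors' implementation: it uses only the two ratios quoted above (1.7 and 1.16), and the function names, as well as the geometric-mean choice of k, are hypothetical and introduced purely for illustration. A residual mismatch of 1.0 means that part of the spectrum is scaled exactly.

```python
import math

F0_RATIO = 1.7        # female/male fundamental-frequency ratio quoted in the abstract
FORMANT_RATIO = 1.16  # female/male formant-frequency ratio quoted in the abstract


def scale_frequencies(freqs_hz, k):
    """Multiply every frequency component by the same global factor k."""
    return [k * f for f in freqs_hz]


def residual_mismatch(k, target_ratio):
    """Leftover frequency ratio after scaling by k (1.0 means a perfect match)."""
    return k / target_ratio


def geometric_mean_compromise():
    """Hypothetical compromise (an assumption, not from the paper):
    the k that balances the two mismatches on a log-frequency scale."""
    return math.sqrt(F0_RATIO * FORMANT_RATIO)


if __name__ == "__main__":
    # Male-to-female direction; the reverse direction would use 1/k.
    for k in (FORMANT_RATIO, geometric_mean_compromise(), F0_RATIO):
        print(
            f"k={k:.3f}  "
            f"F0 mismatch={residual_mismatch(k, F0_RATIO):.3f}  "
            f"formant mismatch={residual_mismatch(k, FORMANT_RATIO):.3f}"
        )
```

Setting k to 1.16 matches the formants but leaves the fundamental too low, while setting k to 1.7 matches the fundamental but pushes the formants too high; any intermediate k trades one error against the other.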

