ISCA Archive ASRIV 1994

Voice conversion by whole-spectrum scaling

P. A. Chan, Robert I. Damper

We describe experiments aimed at quantifying the effectiveness of whole-spectrum multiplicative scaling, with different scaling factors k, as a voice-conversion technique. A review of the literature indicated that the fundamental frequency of female excitation is typically about 1.7 times that of male excitation, whereas female formants are only some 1.16 times higher. A single, global setting of k can therefore only be a compromise between the competing requirements of properly scaling the excitation and envelope parts of the spectrum. Nonetheless, we show that the technique can achieve a useful degree of conversion. Female-to-male transformation was more successful than male-to-female in terms of perceived gender change, but male speech appeared more robust in retaining naturalness and intelligibility when transformed.
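The idea of scaling the whole spectrum by one factor k can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. It assumes a single magnitude spectrum on a uniform frequency grid, ignores phase and resynthesis, and uses linear interpolation to move the energy at frequency f to k·f. The function name `scale_spectrum` and the geometric-mean choice of a compromise k are hypothetical, chosen only for illustration.

```python
import numpy as np

def scale_spectrum(magnitude, k):
    """Move the energy at frequency bin b to bin k*b (illustrative only)."""
    bins = np.arange(len(magnitude))
    # Output bin i takes its value from input position i / k.
    # Positions beyond the last input bin (possible when k < 1) are set to 0.
    return np.interp(bins / k, bins, magnitude, right=0.0)

# Female-to-male ratios quoted in the abstract.
F0_RATIO = 1.7        # fundamental frequency (excitation)
FORMANT_RATIO = 1.16  # formants (spectral envelope)

# Hypothetical compromise: the geometric mean of the two ratios.
# The abstract does not say which values of k were actually tested.
k_compromise = float(np.sqrt(F0_RATIO * FORMANT_RATIO))

# Toy spectrum with a single component at bin 100, scaled upward by k = 1.5.
spectrum = np.zeros(512)
spectrum[100] = 1.0
shifted = scale_spectrum(spectrum, 1.5)
peak_bin = int(np.argmax(shifted))
```

Because one k moves harmonics and formant peaks by the same ratio, any value between 1.16 and 1.7 under-scales the excitation and over-scales the envelope for male-to-female conversion. Taking 1/k reverses the direction.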

Cite as: Chan, P.A., Damper, R.I. (1994) Voice conversion by whole-spectrum scaling. Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification, 165-168

  author={P. A. Chan and Robert I. Damper},
  title={{Voice conversion by whole-spectrum scaling}},
  booktitle={Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification},