ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification
We describe experiments aimed at quantifying the effectiveness of whole-spectrum multiplicative scaling, with different scaling factors k, as a voice-conversion technique. A review of the literature indicated that the fundamental frequency of female excitation is typically a factor of 1.7 greater than that of male excitation, whereas female formants are only about 1.16 times higher. A single, global setting of k can therefore only be a compromise between the competing requirements of correctly scaling the excitation and the envelope parts of the spectrum. Nonetheless, we show that the technique can achieve a useful degree of conversion. Female-to-male transformation was more successful than male-to-female transformation in terms of perceived gender change, but male speech appeared more robust in retaining naturalness and intelligibility when transformed.
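The compromise inherent in a single global k can be illustrated with a small arithmetic sketch. It is not the authors' implementation: only the ratios 1.7 and 1.16 come from the abstract, while the function names, the choice of octaves as the error unit, and the example frequencies are assumptions made for illustration. The sketch treats male-to-female conversion; the female-to-male direction would use the reciprocal factors.

```python
import math

# Female/male ratios quoted in the abstract.
F0_RATIO = 1.7        # fundamental frequency (excitation)
FORMANT_RATIO = 1.16  # formant frequencies (spectral envelope)


def scale_frequencies(freqs_hz, k):
    """Whole-spectrum scaling: every frequency component is multiplied by k."""
    return [k * f for f in freqs_hz]


def residual_mismatch(k):
    """Return (f0_error, formant_error) in octaves for male-to-female scaling by k.

    A value of 0 means that part of the spectrum lands on its target;
    negative means under-scaled, positive means over-scaled.
    (Hypothetical error measure, chosen only for this illustration.)
    """
    f0_error = math.log2(k / F0_RATIO)
    formant_error = math.log2(k / FORMANT_RATIO)
    return f0_error, formant_error


if __name__ == "__main__":
    # Illustrative male values (assumed): F0 of 120 Hz, first formant of 500 Hz.
    male_f0, male_f1 = 120.0, 500.0
    for k in (1.16, 1.4, 1.7):
        f0_scaled, f1_scaled = scale_frequencies([male_f0, male_f1], k)
        f0_err, formant_err = residual_mismatch(k)
        print(f"k={k}: F0={f0_scaled:.1f} Hz, F1={f1_scaled:.1f} Hz, "
              f"F0 error={f0_err:+.2f} oct, formant error={formant_err:+.2f} oct")
```

Under these assumptions, any k between 1.16 and 1.7 under-scales the excitation while over-scaling the envelope, which is the trade-off the abstract describes.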
Bibliographic reference. Chan, P. A. / Damper, Robert I. (1994): "Voice conversion by whole-spectrum scaling", In ASRIV-1994, 165-168.