The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
This paper explores the benefits of transforming spectral peaks in voice conversion. First, in examining classic GMM-based transformation with cepstral coefficients, we show that the lack of variance in the transformed data ("over-smoothing") can be related to the choice of spectral parameterization. Consequently, we propose an alternative parameterization using spectral peaks. The peaks are transformed using HMMs with Gaussian state distributions. Two learning variants are also examined, along with post-processing that accounts for the evolution of the peaks over time. In comparing the different transformation approaches, spectral peaks are shown to offer higher interspeaker feature correlation and to yield higher transformed data variance than their cepstral coefficient counterparts.
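The link between interspeaker correlation and transformed variance can be illustrated with a toy example. The sketch below is not the authors' experimental setup. It assumes a single scalar Gaussian component (a one-component "GMM") and synthetic source/target features with a chosen correlation `rho`. Under these assumptions, the usual conditional-mean mapping produces converted features whose variance is about `rho**2` times the target variance, so weakly correlated features come out over-smoothed. The function name `converted_variance_ratio` and all parameters are hypothetical and used for illustration only.

```python
import math
import random


def converted_variance_ratio(rho, n=50_000, seed=0):
    """Return var(converted) / var(target) for a toy one-Gaussian mapping.

    Assumption: source x and target y are scalar, jointly Gaussian,
    unit-variance features with correlation rho (synthetic data).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        xs.append(z1)
        # Build a target feature with correlation rho to the source.
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

    # Estimate the joint statistics from the paired "training" data.
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    var_y = sum((y - mu_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n

    # Conditional-mean mapping: E[y | x] = mu_y + (cov_xy / var_x) * (x - mu_x).
    slope = cov_xy / var_x
    converted = [mu_y + slope * (x - mu_x) for x in xs]

    # The converted features have mean mu_y by construction.
    var_converted = sum((c - mu_y) ** 2 for c in converted) / n
    return var_converted / var_y


if __name__ == "__main__":
    for rho in (0.3, 0.6, 0.9):
        ratio = converted_variance_ratio(rho)
        print(f"rho={rho:.1f}  variance ratio={ratio:.3f}")
```

In this simplified setting, a feature with higher source/target correlation retains more of the target variance after conversion. That is consistent with the comparison the abstract draws between spectral peaks and cepstral coefficients, although the paper's models (multi-component GMMs and HMMs) are more elaborate than this sketch.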
Index Terms: voice conversion, spectral transformation, spectral peaks
Bibliographic reference. Godoy, Elizabeth / Rosec, Olivier / Chonavel, Thierry (2010): "On transforming spectral peaks in voice conversion", In SSW7-2010, 68-73.