Voice conversion involves transforming segments of speech from a source speaker to make them to be perceived as if spoken by a target speaker. Generally, this process involves the estimation of vocal tract parameters and an excitation signal that match the target speaker. The work presented here proposes an algorithm for estimating the excitation residuals of the target speaker using a weighted combination of clustered residuals. The algorithm is subjected to objective and subjective comparisons to other basic types of residual estimation techniques for voice conversion. Tests were carried on 2 male and 2 female target speakers in an ideal setting. The overall goal of this work is to create an improved algorithm for estimating excitation residuals during voice conversion that maintain speaker recognizability and high synthesis quality.
Bibliographic reference. Percybrooks, Winston S. / Moore, Elliot (2007): "New algorithm for LPC residual estimation from LSF vectors for a voice conversion system", In INTERSPEECH-2007, 1977-1980.