EUROSPEECH 2003 - INTERSPEECH 2003
A speaker conversion framework for formant synthesis is proposed. Given a small set of a target speaker's utterances, the framework converts the segmental features of the original speech to those of the target speaker. Unlike other speaker conversion frameworks, it allows further voice quality modification to be applied to the converted speech using conventional formant modification techniques. The parameter conversion is based on maximum likelihood linear regression (MLLR) in the cepstral domain. The effect of the parameter conversion can be seen in a graphical representation of formant placement. In an auditory experiment, most of the converted speech was perceived as similar to that of the target speakers.
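For orientation, the standard MLLR mean transform can be written as follows. This is the textbook form only; the abstract does not state exactly how the transform is parameterized or estimated in this work, so the symbols below (a cepstral mean vector \mu_m of the original speaker's model, regression matrix A, bias b) are illustrative assumptions rather than the authors' notation.

```latex
% Standard MLLR mean adaptation (generic form, not taken from the paper):
% each cepstral mean vector \mu_m is mapped by a shared affine transform.
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m ,
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix},
\qquad
W = \begin{bmatrix} b & A \end{bmatrix}
% W is chosen to maximize the likelihood of the adaptation data O
% (here, presumably the target speaker's utterances) under the adapted model:
\hat{W} = \arg\max_{W} \; p\!\left(O \mid \lambda, W\right)
```

Because a single transform W is shared across many model components, it can be estimated from a small amount of adaptation data, which is consistent with the small set of target utterances mentioned above.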
Bibliographic reference. Mori, Hiroki / Kasuya, Hideki (2003): "Speaker conversion in ARX-based source-formant type speech synthesis", In EUROSPEECH-2003, 2421-2424.