8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Speaker Conversion in ARX-Based Source-Formant Type Speech Synthesis

Hiroki Mori, Hideki Kasuya

Utsunomiya University, Japan

A speaker conversion framework for formant synthesis is proposed. Given a small set of a target speaker's utterances, the framework converts the segmental features of the source speech to those of the target speaker. Unlike other speaker conversion frameworks, it also allows further voice quality modification of the converted speech using conventional formant modification techniques. The parameter conversion is based on maximum likelihood linear regression (MLLR) in the cepstral domain. The effect of the parameter conversion can be seen in a graphical representation of formant placement. The results of an auditory experiment showed that most of the converted speech was perceived as similar to that of the target speakers.
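
As a rough illustration of what a linear-regression style conversion in the cepstral domain can look like, the sketch below fits an affine map from source cepstral vectors to target cepstral vectors and applies it. It is not the authors' method: full MLLR estimates a transform of model means under a likelihood criterion, whereas this sketch assumes already-aligned frame pairs, a diagonal transform (one scale and one bias per cepstral coefficient), and an ordinary least-squares fit. The function names `fit_diagonal_affine` and `convert` and the toy data are hypothetical.

```python
def fit_diagonal_affine(source, target):
    """Fit target[d] ~ a[d] * source[d] + b[d] per dimension by least squares.

    Assumption: source and target are equal-length lists of time-aligned
    cepstral vectors. This is a simplified stand-in for MLLR, not MLLR itself.
    """
    n = len(source)
    dim = len(source[0])
    scales, biases = [], []
    for d in range(dim):
        xs = [vec[d] for vec in source]
        ys = [vec[d] for vec in target]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Fall back to a pure bias shift if the source dimension is constant.
        a = cov_xy / var_x if var_x > 0 else 1.0
        scales.append(a)
        biases.append(mean_y - a * mean_x)
    return scales, biases


def convert(vec, scales, biases):
    """Apply the fitted per-dimension affine map to one cepstral vector."""
    return [a * x + b for x, a, b in zip(vec, scales, biases)]


# Toy, made-up 2-dimensional "cepstra" (illustrative values only).
source_frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
target_frames = [[1.0, -0.5], [5.0, 0.5], [9.0, 1.5]]

scales, biases = fit_diagonal_affine(source_frames, target_frames)
converted = convert([1.0, 2.0], scales, biases)
```

A diagonal transform keeps the example short; a full matrix transform would let each converted coefficient depend on all source coefficients, at the cost of needing more adaptation data.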

Bibliographic reference.  Mori, Hiroki / Kasuya, Hideki (2003): "Speaker conversion in ARX-based source-formant type speech synthesis", In EUROSPEECH-2003, 2421-2424.