Spoken language conversion (SLC) aims to generate utterances in the voice of a speaker but in a language unknown to them, using speech synthesis systems and speech processing techniques. Previous approaches to SLC have been based on cross-language voice conversion (VC), whose underlying assumptions ignore phonetic and phonological differences between languages, reducing the intelligibility of the output. Accent morphing (AM) was proposed as an alternative approach, and its intelligibility performance was investigated in a previous study. AM attempts to preserve the voice characteristics of the target speaker whilst modifying their accent, using phonetic knowledge obtained from a native speaker of the target language. This paper compares AM and VC in terms of how similar the output sounds to the target speaker. AM achieved similarity ratings at least equivalent to those of VC, but the study highlighted various difficulties in evaluating speaker identity in an SLC context.
Bibliographic reference. Yanagisawa, Kayoko / Huckvale, Mark (2010): "A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identity", In INTERSPEECH-2010, 2150-2153.