Cross-language voice conversion maps the speech of speaker S1 in language L1 to the voice of speaker S2 using only knowledge of how S2 speaks a different language L2. This mapping is usually performed using speech material from S1 and S2 that has been deemed "equivalent" in either acoustic or phonetic terms. This study investigates the issue of equivalence in more detail and contrasts the performance of a voice conversion system operating in both monolingual and cross-language modes using Japanese and English. We show that voice conversion reduces the intelligibility of the converted speech in both modes, but to a significantly greater degree for cross-language conversion. A phonetic comparison of the monolingual and cross-language converted speech suggests that consonantal information is degraded in both conditions, but vowel information is degraded more in the cross-language condition.
Bibliographic reference. Yanagisawa, Kayoko / Huckvale, Mark (2008): "A phonetic assessment of cross-language voice conversion", In INTERSPEECH-2008, 593-596.