Interspeech'2005 - Eurospeech
This paper addresses the generation and evaluation of foreign-accented speech in concatenative text-to-speech (TTS) synthesis. We describe three possible methods of building a Spanish-accented English voice, and evaluate and compare them with respect to preference, intelligibility, and smoothness. Effects of speaking rate and content are also examined.
It is found that although using an unmodified Spanish voice to read English text is possible, the result is not highly intelligible. With some modifications to the linguistic model, a relatively high level of comprehensibility and smoothness can be achieved, not differing widely from ratings given to a native voice at a comparable stage of development. Listeners in perceptual experiments were very consistent in their preference rankings of the three voices, showing that differences in voice-building method are detectable and contribute to synthesis quality.
Bibliographic reference. Tomokiyo, Laura Mayfield / Black, Alan W. / Lenzo, Kevin A. (2005): "Foreign accents in synthetic speech: development and evaluation", In INTERSPEECH-2005, 1469-1472.