This paper presents an eye-tracking experiment comparing the processing of different accent patterns in unit selection synthesis and human speech. The synthetic speech results failed to replicate the facilitative effect of contextually appropriate accent patterns found with human speech, yet produced a more robust intonational garden-path effect with contextually inappropriate patterns. Both outcomes could be due to processing delays seen with the synthetic speech. As the synthetic speech was of high quality, the results indicate that eye tracking holds promise as a highly sensitive and objective method for the online evaluation of prosody in speech synthesis.
Bibliographic reference. White, Michael / Rajkumar, Rajakrishnan / Ito, Kiwako / Speer, Shari R. (2009): "Eye tracking for the online evaluation of prosody in speech synthesis: not so fast!", In INTERSPEECH-2009, 2523-2526.