Interspeech'2005 - Eurospeech
In this paper, we report on a comparative user study of the quality of mobile speech synthesis methods. We measured the impact of device class, data rate, synthesis method (diphone vs. non-uniform unit selection) and lexicon usage on speech quality, assessed through word comprehension and several subjective satisfaction metrics. Seven practically relevant speech synthesis implementations and one natural voice were evaluated using the method recommended in ITU-T P.85, supplemented by pairwise comparisons. In general, although the overall subjective ratings of the synthetic voices differed significantly, the word comprehension rates were quite similar. We found a significant impact of data rate and synthesis method on the mean subjective speech quality, but not on word comprehension. The use of a lexicon in embedded speech synthesis slightly improved the perceived pronunciation quality.
Bibliographic reference. Pucher, Michael / Fröhlich, Peter (2005): "A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality", In INTERSPEECH-2005, 2501-2504.