This paper discusses requirements and methods for conversational speech synthesis, based on experience gained from collecting and analysing a very large corpus of conversational speech recorded in a variety of real-life everyday contexts. It shows that variation in voice quality plays a significant part in the transmission of interpersonal and affect-related social information, and argues that this feature should therefore be given priority in future speech synthesis research. Several possible solutions are proposed.
Cite as: Campbell, N. (2007) Towards conversational speech synthesis; lessons learned from the expressive speech processing project. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 22-27
@inproceedings{campbell07_ssw,
  author={Nick Campbell},
  title={{Towards conversational speech synthesis; lessons learned from the expressive speech processing project}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={22--27}
}