Attempts to add emotion effects to synthesised speech have existed for more than a decade. Several prototypes and fully operational systems have been built on different synthesis techniques, and quite a number of smaller studies have been conducted. This paper aims to give an overview of what has been done in this field, pointing out the inherent properties of the various synthesis techniques used, summarising the prosody rules employed, and examining the evaluation paradigms. Finally, it discusses interesting directions for future development.
Cite as: Schröder, M. (2001) Emotional speech synthesis: a review. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 561-564, doi: 10.21437/Eurospeech.2001-150
@inproceedings{schroder01b_eurospeech,
  author={Marc Schröder},
  title={{Emotional speech synthesis: a review}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={561--564},
  doi={10.21437/Eurospeech.2001-150}
}