Most of the previous research on speech user interfaces has focused on what information should be presented to the user. Equally important is the question of how this information should be presented. Although speech synthesis is quite intelligible in well-formed and simple sentences, it may be very difficult to understand when complex structural elements, like tables or URLs, are spoken. We arranged a controlled experiment to identify the prosodic features that affect the intelligibility and pleasantness of synthetic speech. Pauses were found to make a significant difference in comprehension. Good variation in pitch and rate seems to make a voice more pleasant to listen to but has only a minor positive effect on comprehension. We analyzed the exact ways in which human readers used prosodic elements so that we could construct unique and human-like computer persons for spoken dialogue applications.
Cite as: Hakulinen, J., Turunen, M., Räihä, K.-J. (1999) The use of prosodic features to help users extract information from structured elements in spoken dialogue systems. Proc. ETRW on Dialogue and Prosody, 65-70
@inproceedings{hakulinen99_diapro,
  author={Jaakko Hakulinen and Markku Turunen and Kari-Jouko Räihä},
  title={{The use of prosodic features to help users extract information from structured elements in spoken dialogue systems}},
  year=1999,
  booktitle={Proc. ETRW on Dialogue and Prosody},
  pages={65--70}
}