With proper control of prosody, text-to-speech systems can already imitate distinctive speaking styles. We show two examples in which we capture the critical features: the singing style of Dinah Shore and the speaking style of Martin Luther King Jr. The styles are described by Stem-ML (soft template mark-up language) tags, which offer the flexibility needed to control accent shapes, phrasal pitch contours, and amplitude profiles, for speech as well as for singing.
Cite as: Shih, C., Kochanski, G. (2001) Prosody control for speaking and singing styles. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 669-672, doi: 10.21437/Eurospeech.2001-175
@inproceedings{shih01_eurospeech,
  author={Chilin Shih and Greg Kochanski},
  title={{Prosody control for speaking and singing styles}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={669--672},
  doi={10.21437/Eurospeech.2001-175}
}