This paper proposes an approach to transform speech from a neutral style into other expressive styles using both prosody and voice quality (VoQ). The main aim is to validate the usefulness of VoQ in enhancing expressive synthetic speech. A Harmonic plus Noise Model (HNM) is used to modify speech according to a set of rules extracted from an expressive speech corpus with five categories (neutral, happy, sensual, aggressive and sad). The modified utterances were then evaluated in a perceptual test, whose results indicate that listeners prefer prosody modification combined with VoQ transformation over prosody modification alone.
Index Terms: Expressive speech transformation, voice quality, prosody, Harmonic plus Noise Model
Cite as: Monzo, C., Calzada, A., Iriondo, I., Socoro, J.C. (2010) Expressive speech style transformation: voice quality and prosody modification using a harmonic plus noise model. Proc. Speech Prosody 2010, paper 985
@inproceedings{monzo10_speechprosody,
  author={Carlos Monzo and Angel Calzada and Ignasi Iriondo and Joan Claudi Socoro},
  title={{Expressive speech style transformation: voice quality and prosody modification using a harmonic plus noise model}},
  year=2010,
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 985}
}