ISCA Archive SSW 2010

Synthesis of listener vocalisations with imposed intonation contours

Sathish Pammi, Marc Schröder, Marcela Charfuelan, Oytun Türk, Ingmar Steiner

Synthesising listener vocalisations is an active research area aimed at improving emotionally coloured conversational speech synthesis. To communicate different intentions, a synthesiser should be able to generate a broad range of vocalisations with different acoustic properties. However, the data collected for corpus-based methods are necessarily limited in acoustic variability. This paper describes our approach to increasing the acoustic variability of vocalisations in terms of intonation. After selecting the best candidate for a given target from among the available vocalisations, we use prosody modification techniques to impose a target intonation contour. In an experiment, we combine markedly distinct intonation contours with vocalisations differing in segmental form, using three prosody modification techniques: MLSA vocoding, FD-PSOLA, and HNM. In a listening test, we evaluate the perceived naturalness of the resulting synthesised vocalisations and assess the effect of segmental form, intonation contour, and modification technique on perceived meaning.

Index Terms: listener vocalisations, pitch modification, FD-PSOLA, HNM, MLSA vocoding


Cite as: Pammi, S., Schröder, M., Charfuelan, M., Türk, O., Steiner, I. (2010) Synthesis of listener vocalisations with imposed intonation contours. Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7), 240-245

@inproceedings{pammi10_ssw,
  author={Sathish Pammi and Marc Schröder and Marcela Charfuelan and Oytun Türk and Ingmar Steiner},
  title={{Synthesis of listener vocalisations with imposed intonation contours}},
  year=2010,
  booktitle={Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7)},
  pages={240--245}
}