INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

An HMM-Based Speech Synthesis System Applied to German and Its Adaptation to a Limited Set of Expressive Football Announcements

Sacha Krstulović, Anna Hunecke, Marc Schröder

DFKI GmbH, Germany

The paper assesses the capability of an HMM-based TTS system to produce German speech. The results are discussed in qualitative terms and compared across three different choices of context features. In addition, the system is adapted to a small set of football announcements in an exploratory attempt to synthesise expressive speech. We conclude that the HMMs are able to produce highly intelligible neutral German speech of stable quality, and that expressivity is partially captured in spite of the small size of the football dataset.

Acoustic Demonstration Examples
Synthetic female voice, 5 context features
Synthetic male voice, 5 context features
Synthetic male voice, 29 context features
Synthetic male voice, 57 context features
Synthetic female voice, 29 context features
Synthetic female voice, 57 context features
Synthetic male voice, 29 context features
Synthetic male voice, 29 context features
Human male voice
Synthetic male voice, 29 context features, adaptation to a small sample of neutral football comments; out-of-domain sentence
Synthetic male voice, 29 context features, adaptation to a small sample of excited football comments; out-of-domain sentence
Synthetic male voice, 29 context features, adaptation to a small sample of excited football comments; in-domain sentence

Bibliographic reference. Krstulović, Sacha / Hunecke, Anna / Schröder, Marc (2007): "An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements", in INTERSPEECH-2007, 1897-1900.