Fifth ISCA ITRW on Speech Synthesis

June 14-16, 2004
Pittsburgh, PA, USA

Assessing the Acceptability of the SmartKom Speech Synthesis Voices

Antje Schweitzer (1), Norbert Braunschweiler (1,2), Grzegorz Dogil (1), Bernd Möbius (1)

(1) Institute of Natural Language Processing, University of Stuttgart, Germany
(2) Rhetorical Systems Ltd, Edinburgh, UK

The acceptability of the synthetic voices used by the multimodal SmartKom dialog system was tested in a series of experiments. Early in the project, a first set of evaluation tasks was carried out to verify the intelligibility of the diphone voice, which serves as the default voice for external open-domain applications. The tests confirmed that the diphone voice produced satisfactory intelligibility. The speech corpus for the unit selection voice, recorded by the same speaker, is tailored to the typical, more restricted SmartKom domains. Evaluation tasks focusing on typical SmartKom scenarios demonstrated the superiority of the unit selection voice. In tasks involving open-domain material, however, the intelligibility of the unit selection voice appears to be less consistent than that of the diphone voice. In an audio-visual assessment task involving SmartKom-specific contexts, the unit selection voice was found to be very well accepted and judged to be satisfactorily intelligible.


Bibliographic reference. Schweitzer, Antje / Braunschweiler, Norbert / Dogil, Grzegorz / Möbius, Bernd (2004): "Assessing the acceptability of the SmartKom speech synthesis voices", in SSW5-2004, 1-6.