The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

Synthesizing Fast Speech by Implementing Multi-Phone Units in Unit Selection Speech Synthesis

Donata Moers (1,2), Igor Jauk (1), Bernd Möbius (1,3), Petra Wagner (2)

(1) Language and Speech Communication Division, University of Bonn, Germany
(2) Faculty of Linguistics and Literature, Bielefeld University, Germany
(3) IMNS, University of Stuttgart, Germany

This paper presents a new approach to synthesizing fast speech in unit selection synthesis. Two inventories were recorded, one at a normal and one at a fast speech rate, with the fast speech articulated as accurately as possible, and speech was synthesized from both corpora independently. Since fast speech differs from normal-rate speech in its acoustic characteristics, the concept of multi-phone (phoxsy) units [1] was implemented and used to synthesize speech at both speaking rates again. A perceptual evaluation showed that phoxsy units significantly enhanced the intelligibility of fast speech synthesis.

Index Terms: fast speech, unit selection, phoxsy units

Reference

  1. Breuer, S., Abresch, J. "Phoxsy: Multi-phone segments for unit selection speech synthesis", In Interspeech-2004 (ICSLP).

Bibliographic reference.  Moers, Donata / Jauk, Igor / Möbius, Bernd / Wagner, Petra (2010): "Synthesizing fast speech by implementing multi-phone units in unit selection speech synthesis", In SSW7-2010, 355-358.