Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

When Will Synthetic Speech Sound Human: Role of Rules and Data

Jan van Santen, Michael Macon, Andrew Cronk, John-Paul Hosom, Alexander Kain, Vincent Pagel, Johan Wouters

Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology, USA

Text-to-speech synthesis research has moved away from building general-purpose systems based on an understanding of human language and speech production towards building systems based on statistical algorithms applied to large text and speech corpora, and, recently, towards building such systems for specific domains. Despite substantial progress, the overall quality of even the best systems is often still inadequate for broad user acceptance in applications that cannot also be handled with simple phrase splicing. This tutorial paper analyzes which problems must be addressed to achieve the goal of generating natural-sounding speech in limited domains in a cost-effective way, and the roles of data and rules as we work towards solutions.

Bibliographic reference.  Santen, Jan van / Macon, Michael / Cronk, Andrew / Hosom, John-Paul / Kain, Alexander / Pagel, Vincent / Wouters, Johan (2000): "When will synthetic speech sound human: role of rules and data", In ICSLP-2000, vol.3, 402-409.