Sixth International Conference on Spoken Language Processing
Text-to-speech synthesis research has moved away from building general-purpose systems based on an understanding of human language and speech production towards building systems based on statistical algorithms applied to large text and speech corpora, and, recently, towards building such systems for specific domains. Despite substantial progress, the overall quality of even the best systems is often still inadequate for broad user acceptance in applications that cannot also be handled with simple phrase splicing. This tutorial paper analyzes which problems must be addressed to achieve the goal of generating natural-sounding speech in limited domains in a cost-effective way, and the roles of data and rules as we work towards solutions.
Bibliographic reference. Santen, Jan van / Macon, Michael / Cronk, Andrew / Hosom, John-Paul / Kain, Alexander / Pagel, Vincent / Wouters, Johan (2000): "When will synthetic speech sound human: role of rules and data", In ICSLP-2000, vol.3, 402-409.