Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Preselection of Candidate Units in a Unit Selection-Based Text-To-Speech Synthesis System

Alistair Conkie, Mark C. Beutnagel, Ann K. Syrdal, Philip E. Brown

AT&T Labs - Research, Florham Park, NJ, USA

Unit selection-based speech synthesis has recently been the focus of much attention in the speech synthesis community. In general, the speech quality from such a system achieves a high degree of naturalness and good intelligibility. However, examining and selecting units for synthesis as a runtime operation makes the unit selection process computationally expensive. Considerable attention has been focused on reducing the complexity of unit selection while maintaining quality.

Previous approaches to speeding up the process of runtime unit selection have focused on two aspects. (1) By limiting the number of candidate synthesis units considered in the unit selection process, the number of calculations required can be reduced. (2) By precomputing part of the needed calculations, the runtime complexity can be reduced. Much progress has been made using these methods, but usually at the expense of quality.
We present two methods of reducing the complexity of the calculation that avoid any reduction in synthesis quality, while allowing a very fast unit selection process. Results are presented for the reduction in complexity of the calculation process, and for a perceptual experiment that shows quality is not reduced relative to a full unit selection process.
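
The two general strategies described above, (1) limiting the candidate units per target and (2) reusing precomputed costs, can be illustrated with a minimal sketch. This is a generic illustration, not the two methods proposed in the paper, which the abstract does not detail. All names here (`preselect`, `select_units`, `target_cost`, `join_cost`, `n_best`) are hypothetical, units are assumed to be hashable identifiers, and every target is assumed to have at least one candidate. Note that naive n-best pruning of this kind can change the selected path, which is exactly the quality trade-off the abstract mentions for earlier approaches.

```python
from functools import lru_cache
from typing import Callable, Dict, List, Sequence, Tuple

Unit = str  # hypothetical: a unit is identified by a hashable string


def preselect(target: str, candidates: Sequence[Unit],
              target_cost: Callable[[str, Unit], float],
              n_best: int) -> List[Unit]:
    """Strategy (1): keep only the n_best candidates with the lowest target cost."""
    return sorted(candidates, key=lambda u: target_cost(target, u))[:n_best]


def select_units(targets: Sequence[str],
                 inventory: Dict[str, Sequence[Unit]],
                 target_cost: Callable[[str, Unit], float],
                 join_cost: Callable[[Unit, Unit], float],
                 n_best: int) -> Tuple[float, List[Unit]]:
    """Viterbi search over the preselected candidates.

    Returns (total cost, chosen unit sequence). Assumes targets is non-empty.
    """
    # Strategy (2), approximated here by memoisation: each join cost is
    # computed at most once and then looked up.
    cached_join = lru_cache(maxsize=None)(join_cost)

    lattice = [preselect(t, inventory[t], target_cost, n_best) for t in targets]

    # best[u] = (cost of the cheapest path ending in u, that path)
    best = {u: (target_cost(targets[0], u), [u]) for u in lattice[0]}
    for t, column in zip(targets[1:], lattice[1:]):
        nxt = {}
        for u in column:
            # Cheapest predecessor for u, counting the join into u.
            prev, (cost, path) = min(
                best.items(),
                key=lambda kv: kv[1][0] + cached_join(kv[0], u),
            )
            nxt[u] = (cost + cached_join(prev, u) + target_cost(t, u), path + [u])
        best = nxt
    return min(best.values(), key=lambda cp: cp[0])
```

With `n_best` set to the size of the largest candidate list, the search degenerates to a full unit selection pass, which gives a baseline for measuring how much computation preselection saves.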



Bibliographic reference.  Conkie, Alistair / Beutnagel, Mark C. / Syrdal, Ann K. / Brown, Philip E. (2000): "Preselection of candidate units in a unit selection-based text-to-speech synthesis system", In ICSLP-2000, vol.3, 314-317.