Sixth International Conference on Spoken Language Processing
Unit selection-based speech synthesis has recently been the focus of much attention in the speech synthesis community. In general, the speech quality from such a system achieves a high degree of naturalness and good intelligibility. However, examining and selecting units for synthesis as a runtime operation makes the unit selection process computationally expensive. Considerable attention has been focused on reducing the complexity of unit selection while maintaining quality.
Previous approaches to speeding up the process of runtime unit selection have focused on two aspects. (1) By limiting the number of candidate synthesis units considered in the unit selection process, the number of calculations required can be reduced. (2) By precomputing part of the needed calculations, the runtime complexity can be reduced. Much progress has been made using these methods, but usually at the expense of quality.
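As a rough illustration of approach (1), the sketch below prunes each candidate list to the k units with the lowest target cost before running a standard Viterbi search over the remaining units. It is a generic example, not the method proposed in this paper. The unit representation (an identifier plus a pitch value), the cost functions `target_cost` and `join_cost`, and the names `preselect` and `select_units` are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Tuple

Unit = Tuple[str, float]    # hypothetical unit: (identifier, pitch in Hz)
Target = Tuple[str, float]  # hypothetical target: (phone label, desired pitch in Hz)


def target_cost(desired_pitch: float, unit: Unit) -> float:
    """Assumed target cost: distance between desired and actual pitch."""
    return abs(desired_pitch - unit[1])


def join_cost(prev: Unit, cur: Unit) -> float:
    """Assumed join cost: pitch mismatch at the concatenation point."""
    return abs(prev[1] - cur[1])


def preselect(candidates: List[Unit], desired_pitch: float, k: int) -> List[Unit]:
    """Keep only the k candidates with the lowest target cost."""
    return sorted(candidates, key=lambda u: target_cost(desired_pitch, u))[:k]


def select_units(targets: List[Target],
                 inventory: Dict[str, List[Unit]],
                 k: int) -> Tuple[List[Unit], float]:
    """Viterbi search over candidate lists pruned to at most k units each.

    Assumes every target phone has at least one candidate in the inventory.
    """
    lattice = [preselect(inventory[phone], pitch, k) for phone, pitch in targets]

    # best[i][j] = (cumulative cost of the best path ending in lattice[i][j],
    #               index of its predecessor in lattice[i - 1])
    best = [[(target_cost(targets[0][1], u), -1) for u in lattice[0]]]
    for i in range(1, len(lattice)):
        row = []
        for u in lattice[i]:
            # Each transition costs at most k * k join-cost evaluations.
            cost, back = min(
                (best[i - 1][j][0] + join_cost(p, u), j)
                for j, p in enumerate(lattice[i - 1])
            )
            row.append((cost + target_cost(targets[i][1], u), back))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda idx: best[-1][idx][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(lattice) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


inventory = {
    "a": [("a1", 100.0), ("a2", 120.0), ("a3", 200.0)],
    "b": [("b1", 110.0), ("b2", 118.0), ("b3", 300.0)],
}
targets = [("a", 118.0), ("b", 120.0)]
path, total = select_units(targets, inventory, k=2)
print([unit_id for unit_id, _ in path], total)
```

With k candidates per target, each transition needs at most k × k join-cost evaluations instead of the product of the full list sizes. Because units are discarded on target cost alone, the pruned search can miss the path that is cheapest overall, which is one way this kind of speed-up can cost quality.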
We present two methods of reducing the complexity of the calculation that avoid any reduction in synthesis quality, while allowing a very fast unit selection process. Results are presented for the reduction in complexity of the calculation process, and for a perceptual experiment that shows quality is not reduced relative to a full unit selection process.
Bibliographic reference. Conkie, Alistair / Beutnagel, Mark C. / Syrdal, Ann K. / Brown, Philip E. (2000): "Preselection of candidate units in a unit selection-based text-to-speech synthesis system", In ICSLP-2000, vol.3, 314-317.