EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Custom-Tailoring TTS Voice Font - Keeping the Naturalness When Reducing Database Size

Yong Zhao, Min Chu, Hu Peng, Eric Chang

Microsoft Research Asia, China

This paper presents a framework for custom-tailoring voice fonts in data-driven TTS systems. Three criteria for unit pruning are proposed: the prosodic outlier criterion, the importance criterion, and the combination of the two. Voice fonts of different sizes, pruned with each of the three criteria, are evaluated by simulating speech synthesis over a large amount of text while estimating naturalness with an objective measure. The results show that the combined criterion performs best of the three. The pre-estimated curve of naturalness vs. database size may serve as a reference for custom-tailoring a voice font. Naturalness remains almost unchanged when 50% of instances are pruned with the combined criterion.
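
The abstract does not define the criteria in detail, so the following is only a minimal sketch of how a combined pruning pass might look, not the authors' method. Everything in it is an assumption made for illustration: the `Instance` fields, the use of a z-score over pitch and duration as the prosodic outlier test, the use of a usage count from simulated synthesis as the importance score, the `z_max` threshold, and the order in which the two criteria are applied.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Instance:
    unit: str        # unit label (hypothetical: e.g. a syllable identity)
    pitch: float     # hypothetical prosodic feature
    duration: float  # hypothetical prosodic feature
    usage: int       # hypothetical: times selected in simulated synthesis


def outlier_score(inst, group):
    """Largest absolute z-score of inst among instances of the same unit."""
    score = 0.0
    for feat in ("pitch", "duration"):
        values = [getattr(g, feat) for g in group]
        sd = pstdev(values)
        if sd > 0:  # skip features with no spread in this group
            z = abs(getattr(inst, feat) - mean(values)) / sd
            score = max(score, z)
    return score


def prune_combined(instances, keep_ratio, z_max=2.0):
    """Assumed combination: drop prosodic outliers, then keep the most-used."""
    groups = defaultdict(list)
    for inst in instances:
        groups[inst.unit].append(inst)

    target = max(1, int(len(instances) * keep_ratio))

    # Step 1 (assumed outlier criterion): remove instances far from
    # the prosodic mean of their own unit.
    kept = [i for i in instances if outlier_score(i, groups[i.unit]) <= z_max]

    # Step 2 (assumed importance criterion): rank by usage, keep the top ones.
    kept.sort(key=lambda i: i.usage, reverse=True)
    return kept[:target]


# Toy data: the last instance has an extreme pitch but a high usage count.
font = [
    Instance("a", 100.0, 0.2, 5),
    Instance("a", 101.0, 0.2, 4),
    Instance("a", 99.0, 0.2, 3),
    Instance("a", 100.0, 0.2, 2),
    Instance("a", 102.0, 0.2, 1),
    Instance("a", 200.0, 0.2, 50),
]
pruned = prune_combined(font, keep_ratio=0.5)
```

In this toy run the high-pitch instance is removed by the outlier step despite its large usage count, and the three most-used remaining instances are kept. The sketch does not guarantee that every unit type retains at least one instance, which a real voice font would need.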

Bibliographic reference.  Zhao, Yong / Chu, Min / Peng, Hu / Chang, Eric (2003): "Custom-tailoring TTS voice font - keeping the naturalness when reducing database size", In EUROSPEECH-2003, 2957-2960.