8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Hierarchical Non-Uniform Unit Selection Based on Prosodic Structure

Jun Xu (1), Dezhi Huang (2), Yongxin Wang (1), Yuan Dong (2), Lianhong Cai (1), Haila Wang (2)

(1) Tsinghua University, China
(2) France Telecom R&D Beijing, China

In speech synthesis systems based on waveform concatenation, using longer units can produce more natural synthetic speech. To make better use of the longer units available in the corpus, this paper proposes a hierarchical non-uniform unit selection framework. Each layer of the framework is an independent search procedure that looks for units of a particular size and adopts naturalness measures suited to that unit type. We have applied the framework to our Mandarin speech synthesis system, following the Chinese prosodic structure and taking into account statistics from our corpus. Experimental results show that it outperforms our previous system.
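The layered search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three layer names (prosodic phrase, prosodic word, syllable), the toy inventories, the pinyin labels, and the single-number cost are all assumptions made for illustration, and the join (concatenation) cost that a real unit-selection system would also weigh is left out. The sketch only shows the general idea of trying the largest unit first, with a separate cost function per layer, and falling back to smaller units when no match exists.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical data model: a unit is keyed by its syllable sequence, and each
# candidate is (unit_id, score). Real systems carry much richer features.
Key = Tuple[str, ...]
Candidate = Tuple[str, float]
Inventory = Dict[Key, List[Candidate]]
Cost = Callable[[Candidate], float]


def best_candidate(inventory: Inventory, key: Key, cost: Cost) -> Optional[Candidate]:
    """Return the lowest-cost candidate for key, or None if the layer has none."""
    candidates = inventory.get(key, [])
    return min(candidates, key=cost) if candidates else None


def select_units(
    phrases: List[List[List[str]]],
    inventories: Dict[str, Inventory],
    costs: Dict[str, Cost],
) -> List[Tuple[str, str]]:
    """Cover the utterance with (layer, unit_id) pairs, largest units first."""
    selected: List[Tuple[str, str]] = []
    for phrase in phrases:
        # Layer 1: try to match the whole prosodic phrase.
        phrase_key = tuple(s for word in phrase for s in word)
        hit = best_candidate(inventories["phrase"], phrase_key, costs["phrase"])
        if hit is not None:
            selected.append(("phrase", hit[0]))
            continue
        for word in phrase:
            # Layer 2: try to match the prosodic word.
            hit = best_candidate(inventories["word"], tuple(word), costs["word"])
            if hit is not None:
                selected.append(("word", hit[0]))
                continue
            # Layer 3: fall back to individual syllables.
            for syllable in word:
                hit = best_candidate(
                    inventories["syllable"], (syllable,), costs["syllable"]
                )
                if hit is None:
                    raise KeyError(f"no unit for syllable {syllable!r}")
                selected.append(("syllable", hit[0]))
    return selected


# Toy utterance: two prosodic phrases, each a list of prosodic words.
phrases = [[["jin1", "tian1"], ["tian1", "qi4"]], [["hen3"], ["hao3"]]]
inventories = {
    "phrase": {("hen3", "hao3"): [("P7", 0.2)]},
    "word": {("jin1", "tian1"): [("W1", 0.5), ("W2", 0.1)]},
    "syllable": {("tian1",): [("S3", 0.3)], ("qi4",): [("S9", 0.4)]},
}
# Each layer gets its own cost function; here all three simply read the score.
costs = {layer: (lambda c: c[1]) for layer in inventories}

result = select_units(phrases, inventories, costs)
```

In this toy run the second phrase is covered by a single phrase-level unit, the first word of the first phrase by a word-level unit, and the remaining word by two syllable units. In practice the per-layer cost functions would differ from one another, which is the point of keeping them as separate callables.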


Bibliographic reference. Xu, Jun / Huang, Dezhi / Wang, Yongxin / Dong, Yuan / Cai, Lianhong / Wang, Haila (2007): "Hierarchical non-uniform unit selection based on prosodic structure", in INTERSPEECH-2007, 2861-2864.