8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Discriminative Weight Training for Unit-Selection Based Speech Synthesis

Seung Seop Park, Chong Kyu Kim, Nam Soo Kim

Seoul National University, Korea

Concatenative speech synthesis by selecting units from a large database has become popular because of the high quality of the synthesized speech. The units are selected by minimizing a combination of target and join costs for a given sentence. In this paper, we propose a new approach to training the weight parameters associated with the cost functions used for unit selection in concatenative speech synthesis. We first view unit selection as a classification problem and then apply the discriminative training technique, which has been found to be an efficient approach to parameter estimation in speech recognition. Instead of defining an objective function that accounts for subjective speech quality, we take the classification error as the objective function to be optimized. The classification error is approximated by a smooth function, and the relevant parameters are updated by gradient descent.
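
The idea of smoothing the classification error and updating the weights by gradient descent can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption made for illustration: the sigmoid smoothing, the use of the single best competing unit in the misclassification measure, the clipping of weights at zero, the toy sub-cost values, and all function and variable names.

```python
import math

def total_cost(weights, sub_costs):
    """Weighted sum of sub-costs (e.g. target and join sub-costs) for one unit."""
    return sum(w * c for w, c in zip(weights, sub_costs))

def smoothed_error(weights, correct, competitors, gamma=1.0):
    """Sigmoid-smoothed classification error for one training sample.

    d > 0 means some competitor is cheaper than the correct unit,
    i.e. the selection would be a classification error.
    """
    best = min(competitors, key=lambda c: total_cost(weights, c))
    d = total_cost(weights, correct) - total_cost(weights, best)
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    return loss, best

def train(samples, weights, lr=0.5, gamma=1.0, epochs=50):
    """Per-sample gradient descent on the smoothed error."""
    weights = list(weights)
    for _ in range(epochs):
        for correct, competitors in samples:
            loss, best = smoothed_error(weights, correct, competitors, gamma)
            scale = gamma * loss * (1.0 - loss)  # derivative of the sigmoid
            for i in range(len(weights)):
                grad = scale * (correct[i] - best[i])
                # Assumption: weights are kept non-negative by clipping.
                weights[i] = max(0.0, weights[i] - lr * grad)
    return weights

def error_count(samples, weights):
    """Number of samples where a competitor is at least as cheap as the correct unit."""
    errors = 0
    for correct, competitors in samples:
        cheapest = min(total_cost(weights, c) for c in competitors)
        if total_cost(weights, correct) >= cheapest:
            errors += 1
    return errors

# Toy data (hypothetical): each sample is (sub-costs of the correct unit,
# sub-costs of competing units). Sub-cost 1 is informative, sub-cost 0 is not.
samples = [
    ((0.6, 0.1), [(0.3, 0.9), (0.5, 0.8)]),
    ((0.7, 0.2), [(0.4, 0.7), (0.2, 0.9)]),
    ((0.5, 0.3), [(0.1, 0.8)]),
]
initial = [1.0, 0.1]
trained = train(samples, initial)

print("errors before training:", error_count(samples, initial))
print("errors after training: ", error_count(samples, trained))
```

In this toy setup the initial weights favour the uninformative sub-cost, so every sample is misclassified; training shifts weight toward the informative sub-cost until the correct unit has the lowest total cost in each sample.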


Bibliographic reference.  Park, Seung Seop / Kim, Chong Kyu / Kim, Nam Soo (2003): "Discriminative weight training for unit-selection based speech synthesis", In EUROSPEECH-2003, 281-284.