EUROSPEECH 2003 - INTERSPEECH 2003
Concatenative speech synthesis by selecting units from a large database has become popular because of the high quality of the synthesized speech. The units are selected by minimizing a combination of target and join costs for a given sentence. In this paper, we propose a new approach to training the weight parameters associated with the cost functions used for unit selection in concatenative speech synthesis. We first view unit selection as a classification problem, and then apply the discriminative training technique, which has been found to be an efficient approach to parameter estimation in speech recognition. Instead of defining an objective function that accounts for subjective speech quality, we take the classification error as the objective function to be optimized. The classification error is approximated by a smooth function, and the relevant parameters are updated by means of gradient descent.
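The training idea summarized above can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not spell out; it assumes a generic minimum-classification-error setup in which a unit's total cost is a weighted sum of sub-costs, the "correct" unit should have the lowest cost among the candidates, and the 0/1 classification error is smoothed with a sigmoid. All names (`total_cost`, `mce_step`, `train`, `error_rate`, `TOY_SAMPLES`), the toy numbers, the step size, and the non-negativity clipping are hypothetical choices made for illustration.

```python
import math

def total_cost(weights, sub_costs):
    """Weighted sum of sub-costs (e.g. target and join sub-costs) for one unit."""
    return sum(w * c for w, c in zip(weights, sub_costs))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mce_step(weights, correct, competitors, lr, gamma):
    """One gradient-descent step on a sigmoid-smoothed classification error.

    Assumption: the misclassification measure d is the cost of the correct
    unit minus the cost of its strongest (lowest-cost) competitor, so d > 0
    means the wrong unit would be selected.
    """
    rival = min(competitors, key=lambda c: total_cost(weights, c))
    d = total_cost(weights, correct) - total_cost(weights, rival)
    loss = sigmoid(gamma * d)                 # smooth stand-in for the 0/1 error
    scale = gamma * loss * (1.0 - loss)       # derivative of the sigmoid w.r.t. d
    # d is linear in the weights, so its gradient is (correct - rival).
    # Clipping at zero keeps weights non-negative (an illustrative choice).
    updated = [max(0.0, w - lr * scale * (c - r))
               for w, c, r in zip(weights, correct, rival)]
    return updated, loss

def train(samples, n_weights, epochs=200, lr=0.5, gamma=2.0):
    """Sweep repeatedly over (correct, competitors) pairs, updating the weights."""
    weights = [1.0] * n_weights
    for _ in range(epochs):
        for correct, competitors in samples:
            weights, _ = mce_step(weights, correct, competitors, lr, gamma)
    return weights

def error_rate(weights, samples):
    """Fraction of samples where the correct unit is not the strict cost minimum."""
    errors = 0
    for correct, competitors in samples:
        best_rival = min(total_cost(weights, c) for c in competitors)
        if total_cost(weights, correct) >= best_rival:
            errors += 1
    return errors / len(samples)

# Hypothetical toy data: each sample is (sub-costs of the correct unit,
# sub-costs of competing units). Sub-cost 0 is informative, sub-cost 1 is not.
TOY_SAMPLES = [
    ([0.1, 0.9], [[0.6, 0.2], [0.8, 0.3]]),
    ([0.2, 0.8], [[0.7, 0.1], [0.9, 0.4]]),
    ([0.1, 0.7], [[0.5, 0.2], [0.6, 0.1]]),
]

if __name__ == "__main__":
    trained = train(TOY_SAMPLES, n_weights=2)
    print("trained weights:", trained)
    print("error rate before:", error_rate([1.0, 1.0], TOY_SAMPLES))
    print("error rate after:", error_rate(trained, TOY_SAMPLES))
```

In this toy setting, training shifts weight toward the sub-cost that separates the correct unit from its competitors, which is the behavior a discriminative criterion is meant to produce. A real system would draw the samples from a speech database and would likely use a richer misclassification measure than the single strongest competitor.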
Bibliographic reference. Park, Seung Seop / Kim, Chong Kyu / Kim, Nam Soo (2003): "Discriminative weight training for unit-selection based speech synthesis", In EUROSPEECH-2003, 281-284.