Improving preselection in unit selection synthesis

Alistair Conkie, Ann Syrdal, Yeon-Jun Kim, Mark Beutnagel

Unit selection synthesis is a method of selecting and concatenating speech segments from a large single-speaker audio database to synthesize utterances. Selection is based on assigning target and concatenation costs to units and then finding a lowest-cost sequence of units that will synthesize a given utterance. To synthesize efficiently, it is necessary to limit the number of units considered in the unit selection cost network, a step called preselection. This paper examines the role of preselection in unit selection synthesis. We refine the existing process of preselection by adding multiple phone sets to the list of features considered. We present experimental results that demonstrate better database usage and significantly increased synthesis quality using this new method.
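
The pipeline described in the abstract (preselect candidates, then search the cost network for a lowest-cost sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the unit representation (a phone label plus a pitch value), the cost functions `target_cost` and `concat_cost`, and the names `preselect` and `select_units` are all hypothetical, chosen only to keep the example small. The sketch assumes every target phone has at least one candidate in the database, and it does not model the multiple-phone-set features the paper proposes.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a database unit: (phone label, pitch in Hz).
# Real systems use much richer features; this is an assumption for illustration.
Unit = Tuple[str, float]


def target_cost(target: Unit, unit: Unit) -> float:
    """Hypothetical target cost: pitch mismatch between request and unit."""
    return abs(target[1] - unit[1])


def concat_cost(left: Unit, right: Unit) -> float:
    """Hypothetical concatenation cost: pitch jump at the join."""
    return abs(left[1] - right[1])


def preselect(target: Unit, candidates: List[Unit], k: int) -> List[Unit]:
    """Keep only the k candidates with the lowest target cost."""
    return sorted(candidates, key=lambda u: target_cost(target, u))[:k]


def select_units(
    targets: List[Unit], database: Dict[str, List[Unit]], k: int
) -> Tuple[List[Unit], float]:
    """Return a lowest-cost unit sequence over the preselected network."""
    if not targets:
        return [], 0.0

    # One pruned candidate list per target position.
    lattice = [preselect(t, database[t[0]], k) for t in targets]

    # best[i][j] = (cumulative cost of the best path ending in lattice[i][j],
    #               index of its predecessor in lattice[i - 1])
    best = [[(target_cost(targets[0], u), -1) for u in lattice[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in lattice[i]:
            cost, back = min(
                (best[i - 1][j][0] + concat_cost(p, u), j)
                for j, p in enumerate(lattice[i - 1])
            )
            row.append((cost + target_cost(targets[i], u), back))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda idx: best[-1][idx][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total
```

In this sketch the search cost grows with the square of `k` at each position, which is why limiting the candidates per target matters for efficiency. The trade-off is that too small a `k` can prune units that would have joined well with their neighbors.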

doi: 10.21437/Interspeech.2008-172

Cite as: Conkie, A., Syrdal, A., Kim, Y.-J., Beutnagel, M. (2008) Improving preselection in unit selection synthesis. Proc. Interspeech 2008, 585-588, doi: 10.21437/Interspeech.2008-172

