5th International Conference on Spoken Language Processing
This paper describes a method for selecting units from a database of recorded speech, for use in a concatenative speech synthesiser. The simplest approach is to store one example of every possible unit. A more powerful method is to store multiple examples of each unit. The challenge for such a method is to provide an efficient means of selecting units from a practical inventory so that they give the best approximation to the desired sequence under some clearly specified criterion. The method used in BT's Laureate system uses mixed N-phone units. In theory such units could be of arbitrary size, but in practice they are constrained to a maximum of three phones. The system dynamically generates the unit sequence based on a global cost. Units are selected using purely phonologically motivated criteria, without reference to acoustic features, either desired or available within the inventory.
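The idea of building a unit sequence from mixed N-phone units under a global cost can be illustrated with a small dynamic-programming sketch. This is not the Laureate implementation: the abstract does not give the cost function, so the function name `select_units`, the `inventory` mapping, and its per-unit symbolic costs are all hypothetical, chosen only to show how a cheapest cover of a phone string by units of one to three phones might be found without using any acoustic features.

```python
from typing import Dict, List, Optional, Tuple

MAX_PHONES = 3  # the abstract states units are limited to three phones

Unit = Tuple[str, ...]


def select_units(target: List[str],
                 inventory: Dict[Unit, float]) -> Optional[List[Unit]]:
    """Cover `target` with inventory units of 1..MAX_PHONES phones.

    `inventory` maps each available unit to a symbolic cost. These costs
    are an assumption made for illustration, not the paper's criteria.
    Returns the cheapest unit sequence, or None if no cover exists.
    """
    n = len(target)
    inf = float("inf")
    best = [inf] * (n + 1)               # best[i]: min cost to cover target[:i]
    back: List[Optional[int]] = [None] * (n + 1)  # start index of last unit
    best[0] = 0.0

    for end in range(1, n + 1):
        for size in range(1, min(MAX_PHONES, end) + 1):
            start = end - size
            unit = tuple(target[start:end])
            if unit not in inventory or best[start] == inf:
                continue
            cost = best[start] + inventory[unit]
            if cost < best[end]:
                best[end] = cost
                back[end] = start

    if best[n] == inf:
        return None

    # Walk the back-pointers to recover the chosen units in order.
    units: List[Unit] = []
    end = n
    while end > 0:
        start = back[end]
        units.append(tuple(target[start:end]))
        end = start
    units.reverse()
    return units


# Hypothetical inventory: single phones cost 1.0, longer units slightly more,
# so covering the target with fewer, longer units is cheaper overall.
example_inventory: Dict[Unit, float] = {
    ("h",): 1.0, ("e",): 1.0, ("l",): 1.0, ("o",): 1.0,
    ("h", "e"): 1.2,
    ("l", "l", "o"): 1.5,
}
example_result = select_units(["h", "e", "l", "l", "o"], example_inventory)
```

In this sketch the total cost is just the sum of per-unit costs. A fuller model would presumably also score how well each unit's phonological context matches the target, but the abstract does not specify those terms, so they are left out here.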
Bibliographic reference. Breen, Andrew P. / Jackson, Peter (1998): "A phonologically motivated method of selecting non-uniform units", In ICSLP-1998, paper 0389.