EUROSPEECH 2001 Scandinavia
Much work has been done on deriving pronunciation dictionaries automatically from training data. These attempts have focused mainly on maximum likelihood or similar techniques. Because the pronunciation process is complex and variable, it is difficult to find an adequate pronunciation model, and any model will deviate from the truth. Hence, the application of maximum likelihood techniques is likely to be suboptimal. For this reason we present an approach in which the pronunciation model is learned discriminatively from data. The corresponding theory utilizes (1) probabilistic weighting of the pronunciation variants of words and (2) discriminative model combination (DMC) based on Viterbi approximations. We show that the derived theory adjusts the weighting of pronunciation variants with respect to the word error rate, the frequency of occurrence of the specific pronunciation in the training data, and the likelihood of the acoustic observation sequence given the pronunciation.
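As a rough illustration of the two ingredients named above, the following minimal sketch scores pronunciation variants with a weighted (log-linear) combination of a variant prior and an acoustic log-likelihood, and keeps only the best-scoring variant in the spirit of a Viterbi approximation. It is an assumption-laden toy, not the formulation of the paper: the function names, the weights `lam_prior` and `lam_acoustic`, and all numbers are hypothetical, and the discriminative estimation of the weights from word errors is not shown.

```python
import math

def variant_score(log_prior, log_acoustic, lam_prior=1.0, lam_acoustic=1.0):
    """Weighted sum of log-scores for one pronunciation variant (hypothetical form)."""
    return lam_prior * log_prior + lam_acoustic * log_acoustic

def best_variant(variants, lam_prior=1.0, lam_acoustic=1.0):
    """Return the variant with the highest combined score.

    Taking the maximum over variants instead of summing over them
    mirrors a Viterbi-style approximation.
    variants: dict mapping a phone string to (log_prior, log_acoustic).
    """
    return max(
        variants,
        key=lambda v: variant_score(*variants[v], lam_prior, lam_acoustic),
    )

# Invented example values: two variants of one word with relative
# frequencies 0.7 and 0.3 and made-up acoustic log-likelihoods.
variants = {
    "t ah m ey t ow": (math.log(0.7), -120.0),
    "t ah m aa t ow": (math.log(0.3), -118.5),
}

choice_equal = best_variant(variants)
choice_prior_heavy = best_variant(variants, lam_prior=3.0)
```

With equal weights the acoustically better variant is selected; raising `lam_prior` shifts the decision toward the variant seen more often in training. A discriminative procedure would tune such weights to reduce recognition errors rather than fixing them by hand.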
Bibliographic reference. Schramm, Hauke / Beyerlein, Peter (2001): "Towards discriminative lexicon optimization", In EUROSPEECH-2001, 1457-1460.