ITRW on
Pronunciation dictionaries are the interface between the orthographic and phonetic representations of the speech signal and are therefore a substantial component of speech recognition systems. Many systems use simple canonical pronunciation forms in the dictionary. These represent the "correct" pronunciation as found in lexicons and contain neither the most frequent pronunciation nor pronunciation variations. Canonical dictionaries are often extended manually with pronunciation variants to improve their performance. This is a time-consuming process that depends on the skill and expert knowledge of the scientist. Another popular approach uses rules to generate pronunciation variants. Unless the rule set contains a large number of specialized rules with very long contexts, the major disadvantage of this approach is its tendency to overgeneralize. Moreover, rule sets often provide no information on how probable the application of each rule is. A third common approach is the data-driven generation of rule sets or pronunciation models. These methods try to eliminate manual work (and human error) and allow a scalable degree of generalization as well as the estimation of rule-application or variant probabilities.
In this paper we introduce a method for the automatic training of pronunciation dictionaries from a speech database. We give an overview of our training procedure and discuss experimental results.
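To illustrate what estimating variant probabilities from data can mean in its simplest form, the following sketch counts the phone sequences observed for each word and converts the counts into relative frequencies. It is a generic illustration and not the training procedure of this paper: the function name, the `min_prob` pruning threshold, and the example transcriptions are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def estimate_variant_probabilities(observations, min_prob=0.0):
    """Relative-frequency estimate of pronunciation variant probabilities.

    observations: iterable of (word, phone_sequence) pairs, e.g. taken from
    a phonetically labelled speech database (hypothetical input format).
    min_prob: variants with a lower relative frequency are pruned
    (an assumed, illustrative parameter).
    """
    counts = defaultdict(Counter)
    for word, phones in observations:
        counts[word][tuple(phones)] += 1

    dictionary = {}
    for word, variants in counts.items():
        total = sum(variants.values())
        probs = {v: c / total for v, c in variants.items()}
        kept = {v: p for v, p in probs.items() if p >= min_prob}
        if not kept:
            # Always keep at least the most frequent variant.
            best = max(probs, key=probs.get)
            kept = {best: probs[best]}
        # Renormalize so the remaining variants sum to one.
        norm = sum(kept.values())
        dictionary[word] = {v: p / norm for v, p in kept.items()}
    return dictionary


# Made-up observations for one word: a full form seen three times
# and a reduced form seen once.
observations = [
    ("haben", ["h", "a:", "b", "@", "n"]),
    ("haben", ["h", "a:", "b", "@", "n"]),
    ("haben", ["h", "a:", "b", "m"]),
    ("haben", ["h", "a:", "b", "@", "n"]),
]
trained = estimate_variant_probabilities(observations)
```

Raising `min_prob` trades coverage of rare variants for a smaller dictionary, which is one crude way to control the degree of generalization mentioned above.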
Bibliographic reference. Wolff, Matthias / Eichner, Matthias / Hoffmann, Rüdiger (2001): "Automatic learning and optimization of pronunciation dictionaries", In Adaptation-2001, 159-162.