8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

The Efficient Generation of Pronunciation Dictionaries: Machine Learning Factors during Bootstrapping

Marelie Davel, Etienne Barnard

CSIR/University of Pretoria, South Africa

Several factors affect the efficiency of bootstrapping approaches to the generation of pronunciation dictionaries. We focus on factors related to the underlying rule-extraction algorithms and demonstrate variants of the Dynamically Expanding Context algorithm that are beneficial for this application. In particular, we show that continuous updating of the learned rules, coupled with a new approach to phoneme alignment and a sliding-window approach to choosing the context window, leads to an efficient and accurate bootstrapping mechanism.
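
For readers unfamiliar with context-expansion rule extraction, the following is a minimal sketch of the general idea only, not the variants evaluated in the paper. It assumes the training words are already aligned one phoneme per letter, extracts rules in a single batch pass (no continuous updating), and grows the context in a fixed order rather than with a sliding window. The names `CONTEXT_SIZES`, `context_key`, `extract_rules`, `predict`, and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict

# Assumed expansion order (left_width, right_width); each step nests the previous one.
CONTEXT_SIZES = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]


def context_key(word, i, left, right):
    """Return (left context, letter, right context) for letter i, padded with '#'."""
    padded = "#" * left + word + "#" * right
    j = i + left  # position of letter i inside the padded string
    return (padded[j - left:j], padded[j], padded[j + 1:j + 1 + right])


def extract_rules(aligned):
    """Give each letter occurrence the smallest context that is unambiguous in the data.

    `aligned` is a list of (word, phonemes) pairs with len(word) == len(phonemes).
    """
    occurrences = [(w, i, p) for w, phones in aligned for i, p in enumerate(phones)]
    counts = defaultdict(Counter)
    for w, i, p in occurrences:
        counts[w[i]][p] += 1

    rules = {}
    pending = occurrences
    for left, right in CONTEXT_SIZES:
        # Collect every phoneme observed for each context at this width.
        seen = defaultdict(set)
        for w, i, p in occurrences:
            seen[context_key(w, i, left, right)].add(p)
        still_pending = []
        for w, i, p in pending:
            key = context_key(w, i, left, right)
            if len(seen[key]) == 1:
                rules[key] = p  # context is unambiguous: keep it as a rule
            else:
                still_pending.append((w, i, p))  # conflict: try a wider context
        pending = still_pending

    # Fallback for unseen contexts: the most frequent phoneme of each letter.
    defaults = {g: c.most_common(1)[0][0] for g, c in counts.items()}
    return rules, defaults


def predict(word, rules, defaults):
    """Apply the smallest matching context rule to each letter of `word`."""
    result = []
    for i, letter in enumerate(word):
        phoneme = defaults.get(letter, "?")  # "?" marks a letter never seen in training
        for left, right in CONTEXT_SIZES:
            key = context_key(word, i, left, right)
            if key in rules:
                phoneme = rules[key]
                break
        result.append(phoneme)
    return result


# Toy aligned data (illustrative symbols, not a real phoneme set).
TOY = [("cat", ["k", "a", "t"]), ("city", ["s", "i", "t", "i"]), ("cot", ["k", "o", "t"])]
rules, defaults = extract_rules(TOY)
print(predict("cit", rules, defaults))
```

In this toy data the letter "c" is ambiguous on its own, so no context-free rule is created for it; the rule is instead resolved by one letter of right context, which is why the unseen word "cit" receives the "s" pronunciation.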

Bibliographic reference. Davel, Marelie / Barnard, Etienne (2004): "The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping", in INTERSPEECH-2004, 2781-2784.