5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Modeling Pronunciation Variation for a Dutch CSR: Testing Three Methods

Mirjam Wester, Judith M. Kessens, Helmer Strik

A2RT, University of Nijmegen, The Netherlands

This paper describes how the performance of a continuous speech recognizer (CSR) for Dutch was improved by modeling pronunciation variation. Three methods were tested: one for within-word variation and two for cross-word variation. Within-word variation was modeled by applying phonological rules to the words in the lexicon, thus automatically generating pronunciation variants. Cross-word variation was modeled in two ways: in the first approach, the cross-word variants were added to the lexicon as separate words; in the second, they were represented as multi-words. Recognition experiments were carried out for each method. Modeling within-word variation yielded a significant improvement. Furthermore, modeling cross-word processes using multi-words led to significantly better results than modeling them using separate words in the lexicon.
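
As a rough illustration of the lexicon-based methods summarized above, the following minimal sketch generates pronunciation variants by applying optional rewrite rules to a canonical transcription and adds a multi-word entry for a word sequence. Everything specific in it is an assumption made for illustration: the rule, the SAMPA-like transcriptions, the example words, and the function names `generate_variants`, `expand_lexicon`, and `add_multi_word` are hypothetical and are not taken from the recognizer described in this paper.

```python
import re

# Hypothetical optional rule over a SAMPA-like phone string ("@" = schwa).
# Each rule is (regex pattern, replacement); illustrative only.
RULES = [
    (r"@n$", "@"),  # optionally drop word-final /n/ after schwa
]


def generate_variants(canonical, rules):
    """Return the canonical form followed by all rule-derived variants."""
    variants = {canonical}
    for pattern, replacement in rules:
        # Each rule is optional, so keep both the input and the output forms.
        variants |= {re.sub(pattern, replacement, v) for v in variants}
    derived = sorted(variants - {canonical})
    return [canonical] + derived


def expand_lexicon(lexicon, rules):
    """Map each word to its list of within-word pronunciation variants."""
    return {word: generate_variants(pron, rules) for word, pron in lexicon.items()}


def add_multi_word(lexicon, words, reduced_forms):
    """Add a word sequence as one lexicon entry (a multi-word).

    The first pronunciation is the concatenation of the canonical forms of
    the component words; reduced_forms are supplied cross-word variants.
    """
    key = "_".join(words)
    canonical = "".join(lexicon[w][0] for w in words)
    lexicon[key] = [canonical] + list(reduced_forms)
    return lexicon


# Hypothetical base lexicon: orthography -> canonical transcription.
base = {"lopen": "lop@n", "het": "hEt", "is": "Is"}
lexicon = expand_lexicon(base, RULES)
lexicon = add_multi_word(lexicon, ["het", "is"], ["tIs"])
```

In this sketch each variant is simply an additional pronunciation of the same lexicon entry, and the multi-word is an additional entry whose key joins the component words. A real system would also have to represent such multi-words in the language model and retrain the acoustic models, which is not shown here.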

