Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA), September 14-15, 2002
This paper further develops and evaluates a previously proposed method for automatically constructing a lexicon with pronunciation variants for automatic speech recognition (ASR). The basic idea is to transform a lexicon of canonical forms by means of rewrite rules that are learned automatically from a training corpus of orthographically transcribed utterances. The method is evaluated on the TIMIT corpus, using a speech recognizer that incorporates context-independent HMMs and a bigram language model. The results show that word error rate reductions of up to 35% are achievable, but that much smaller gains are the more likely outcome.
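To make the lexicon transformation concrete, the following minimal Python sketch applies context-dependent rewrite rules to canonical pronunciations and collects the resulting variants. It is an illustration only, not the authors' implementation: the rule format `(left, focus, output, right)`, the function names, the phone symbols, and the three hand-written rules are all assumptions, whereas in the paper the rules are learned automatically from the training corpus.

```python
from itertools import product
from typing import Dict, List, Tuple

Phones = Tuple[str, ...]
# Hypothetical rule format: rewrite `focus` as `output` when the canonical
# neighbours are `left` and `right`. "*" matches any context, "#" is a word
# boundary, and an empty `output` tuple deletes the phone.
Rule = Tuple[str, str, Phones, str]


def expand_pronunciation(phones: Phones, rules: List[Rule]) -> List[Phones]:
    """Return the canonical form followed by all variants licensed by the rules."""
    options: List[List[Phones]] = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else "#"
        right = phones[i + 1] if i + 1 < len(phones) else "#"
        alternatives: List[Phones] = [(phone,)]  # keeping the phone is always allowed
        for rule_left, focus, output, rule_right in rules:
            if (focus == phone
                    and rule_left in ("*", left)
                    and rule_right in ("*", right)
                    and output not in alternatives):
                alternatives.append(output)
        options.append(alternatives)

    variants: List[Phones] = []
    for choice in product(*options):
        variant = tuple(p for segment in choice for p in segment)
        if variant and variant not in variants:
            variants.append(variant)
    return variants


def expand_lexicon(lexicon: Dict[str, Phones], rules: List[Rule]) -> Dict[str, List[Phones]]:
    """Map every word to its canonical form plus generated variants."""
    return {word: expand_pronunciation(p, rules) for word, p in lexicon.items()}


# Illustrative (made-up) canonical lexicon and rules.
canonical = {"and": ("ae", "n", "d"), "want": ("w", "aa", "n", "t")}
rules: List[Rule] = [
    ("n", "d", (), "#"),        # word-final /d/ deletion after /n/
    ("n", "t", (), "#"),        # word-final /t/ deletion after /n/
    ("*", "ae", ("ax",), "n"),  # vowel reduction before /n/
]

variant_lexicon = expand_lexicon(canonical, rules)
for word, prons in variant_lexicon.items():
    print(word, [" ".join(p) for p in prons])
```

In this sketch every rule is matched against the canonical context and applied optionally, so the number of variants can grow quickly as rules are added. A practical system would typically attach probabilities to the rules and prune unlikely variants before recognition.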
Bibliographic reference. Yang, Qian / Martens, Jean-Pierre / Ghesquiere, Pieter-Jan / Van Compernolle, Dirk (2002): "Pronunciation variation modeling for ASR: large improvements are possible but small ones are likely", In PMLA-2002, 123-128.