A computational-phonological method is presented that automatically adapts the phone transcriptions in a lexicon to improve ASR performance in a number of mid-size recognition tasks. The lexical adaptation approach uses supervised phoneme loops over cd-HMM segments to find alternative transcriptions, and can be regarded as a counterpart of the K-means algorithm at the symbolic level. In a limited task (digit string recognition) with dialect speakers, the word error rate is shown to drop by 20-25 percent relative when starting from non-dialect digit transcriptions. Since the method is computationally demanding, it is only feasible for relatively small tasks.
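The K-means analogy can be made concrete with a small sketch. It is an illustration only, not the procedure from the paper: the acoustic scoring with cd-HMMs is replaced here by plain Levenshtein distance between phone strings, and the "centroid" of a cluster is taken to be its medoid. The function names (`edit_distance`, `medoid`, `adapt_word`) and the ARPAbet-like phone symbols in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple

Phones = Tuple[str, ...]


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def medoid(cluster: List[Phones]) -> Phones:
    """Member with the smallest summed distance to all members.

    Ties are broken by tuple ordering so the result is deterministic.
    """
    return min(cluster,
               key=lambda c: (sum(edit_distance(c, o) for o in cluster), c))


def adapt_word(initial: List[Phones],
               observed: List[Phones],
               max_iter: int = 10) -> List[Phones]:
    """K-medoids-style update of one word's pronunciation variants.

    `initial` holds the current lexicon transcriptions of the word.
    `observed` holds phone strings assumed to come from decoding training
    tokens of that word (an assumption standing in for the phoneme loop).
    """
    variants = list(initial)
    for _ in range(max_iter):
        # Assignment step: attach each observed string to its nearest variant.
        clusters: List[List[Phones]] = [[] for _ in variants]
        for obs in observed:
            nearest = min(range(len(variants)),
                          key=lambda k: edit_distance(obs, variants[k]))
            clusters[nearest].append(obs)
        # Update step: replace each variant by the medoid of its cluster;
        # a variant with an empty cluster is left unchanged.
        updated = [medoid(c) if c else v for c, v in zip(clusters, variants)]
        if updated == variants:
            break
        variants = updated
    return variants
```

For instance, calling `adapt_word([("f", "ay", "v")], observed)` with observed tokens dominated by `("f", "aa", "v")` moves the single lexicon entry to that dialect-like form. The cost of repeating such a loop with real acoustic decoding for every word is one way to see why the approach is limited to small tasks.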
Cite as: ten Bosch, L.F.M., Cremelie, N. (2001) Pronunciation modeling and lexical adaptation in midsize vocabulary ASR. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1421-1424, doi: 10.21437/Eurospeech.2001-19
@inproceedings{bosch01_eurospeech,
  author={Louis F. M. ten Bosch and Nick Cremelie},
  title={{Pronunciation modeling and lexical adaptation in midsize vocabulary ASR}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1421--1424},
  doi={10.21437/Eurospeech.2001-19}
}