Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Pronunciation Variation in ASR: Which Variation to Model?

Mirjam Wester, Judith M. Kessens, Helmer Strik

A2RT, Dept. of Language and Speech, University of Nijmegen, the Netherlands

This paper describes how the performance of a continuous speech recognizer for Dutch was improved by modeling within-word and cross-word pronunciation variation. Compared to the baseline system, a relative improvement of 8.8% in word error rate (WER) was found. Because WERs do not reveal the full effect of modeling pronunciation variation, we performed a detailed analysis of the differences in recognition results that arise from it, and found that many of these differences are indeed not reflected in the error rates. Furthermore, error analysis revealed that testing sets of variants in isolation does not predict their behavior in combination. However, these results appeared to be corpus dependent.
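
The two quantities discussed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis procedure: the function names, the per-word correctness representation (which ignores insertions), and all numbers are hypothetical and chosen only for illustration. The first function computes a relative WER improvement; the second shows how a net error count can hide changes in individual recognition results.

```python
def relative_improvement(baseline_wer, new_wer):
    """Relative WER reduction, as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def compare_word_outcomes(baseline_correct, new_correct):
    """Count reference words whose outcome changed between two systems.

    Each argument is a list of booleans, one per reference word,
    True when that word was recognized correctly (hypothetical format).
    """
    if len(baseline_correct) != len(new_correct):
        raise ValueError("both systems must be scored on the same words")
    pairs = list(zip(baseline_correct, new_correct))
    improved = sum(1 for b, n in pairs if not b and n)
    deteriorated = sum(1 for b, n in pairs if b and not n)
    return {
        "improved": improved,
        "deteriorated": deteriorated,
        "net": improved - deteriorated,
    }


# Hypothetical absolute WERs; the abstract reports only the relative figure.
print(round(relative_improvement(10.0, 9.12), 1))

# Hypothetical outcomes: four words change, yet the error count is unchanged.
baseline = [True, False, True, True, False]
modified = [False, True, True, False, True]
print(compare_word_outcomes(baseline, modified))
```

In the second example the net change is zero even though four of the five words are recognized differently, which is the kind of difference an error rate alone does not show.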



Bibliographic reference. Wester, Mirjam / Kessens, Judith M. / Strik, Helmer (2000): "Pronunciation variation in ASR: which variation to model?", in ICSLP-2000, vol. 4, 488-491.