Due to the large variability of pronunciation in spontaneous speech, pronunciation modeling has become a more challenging and essential part of speech recognition. In this paper, we describe two different approaches to pronunciation modeling using decision trees. At the lexical level, a pronunciation variation dictionary is built to obtain alternative pronunciations for each word, in which each entry is associated with a variation probability. At the decoding level, decision tree pronunciation models are applied to expand the search space to include alternative pronunciations. Relative error reductions of 7.21% and 4.81% were achieved at the lexical level and the decoding level, respectively. The results at the two levels are compared and contrasted.
Cite as: Kam, P., Lee, T. (2002) Modeling pronunciation variation for Cantonese speech recognition. Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002), 12-17
@inproceedings{kam02_pmla,
  author={Patgi Kam and Tan Lee},
  title={{Modeling pronunciation variation for Cantonese speech recognition}},
  year=2002,
  booktitle={Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002)},
  pages={12--17}
}