Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper evaluates different ways of "spelling" a word in a speech recognizer's lexicon. In particular, we compare three sources of the sub-word units for which we build acoustic models: (1) a coarse phonemic representation, (2) a single, fine phonetic realization, and (3) multiple phonetic realizations with associated likelihoods. We describe how we obtain these pronunciations and evaluate them on the DARPA Resource Management task using the word-pair grammar (perplexity 60). We obtain 93.4% word accuracy using phonemic pronunciations, 94.1% using a single phonetic pronunciation per word, and 96.3% using multiple phonetic pronunciations per word with associated likelihoods.
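As an illustration only, the three kinds of lexicon entry compared in the abstract might be laid out as in the minimal Python sketch below. The example word, the phone symbols, and the variant probabilities are hypothetical and are not taken from the paper; the helper names `pronunciation_log_probs` and `relative_error_reduction` are likewise assumptions introduced here. The second helper assumes that word error is simply 100% minus word accuracy.

```python
import math

# Hypothetical entries: phone symbols and probabilities are invented for
# illustration and do not come from the paper.
PHONEMIC = {"water": ["w", "ao", "t", "er"]}          # (1) coarse phonemic form
SINGLE_PHONETIC = {"water": ["w", "ao", "dx", "er"]}  # (2) one fine phonetic form
MULTI_PHONETIC = {                                    # (3) several forms + likelihoods
    "water": [
        (["w", "ao", "dx", "er"], 0.7),
        (["w", "ao", "t", "er"], 0.3),
    ]
}


def pronunciation_log_probs(lexicon, word):
    """Return (phones, log-probability) pairs, normalized over the word's variants."""
    variants = lexicon[word]
    total = sum(prob for _, prob in variants)
    return [(phones, math.log(prob / total)) for phones, prob in variants]


def relative_error_reduction(baseline_acc, new_acc):
    """Relative drop in word error, assuming error = 100 - accuracy (in percent)."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


if __name__ == "__main__":
    for phones, log_prob in pronunciation_log_probs(MULTI_PHONETIC, "water"):
        print(" ".join(phones), round(log_prob, 3))
    # Accuracies reported in the abstract: phonemic vs. multiple phonetic.
    print(round(relative_error_reduction(93.4, 96.3), 3))
```

Under that assumption, moving from 93.4% to 96.3% word accuracy corresponds to roughly a 44% relative reduction in word errors (from 6.6% to 3.7%).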
Bibliographic reference. Riley, Michael D. / Ljolje, Andrej (1992): "Recognizing phonemes vs. recognizing phones: a comparison", In ICSLP-1992, 285-288.