INTERSPEECH 2004 - ICSLP
This paper describes several experiments aimed at the long-term goal of enabling a conversational interface to automatically improve its pronunciation lexicon over time, both through direct interactions with end users and from available Web sources. We selected a set of 200 rare words from the OGI corpus of spoken names, and performed several experiments combining spelling and pronunciation information to hypothesize phonemic baseforms for these words. We evaluated the quality of the resulting baseforms through a series of recognition experiments, using the 200 words in an isolated-word recognition task. We also report a modification to the letter-to-sound system that uses a letter-phoneme n-gram language model, either alone or in combination with the original "column-bigram" model, to provide additional linguistic constraint. The experiments confirm that acoustic information drawn from spoken examples of the words can greatly improve the quality of the baseforms, as measured by the recognition error rate.
Bibliographic reference. Chung, Grace / Wang, Chao / Seneff, Stephanie / Filisko, Ed / Tang, Min (2004): "Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation", In INTERSPEECH-2004, 1457-1460.