8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

New Word Acquisition Using Subword Modeling

Ghinwa F. Choueiter, Stephanie Seneff, James Glass


In this paper, we use subword modeling to learn the pronunciations and spellings of new words. The subwords are generated with a context-free grammar and are units intermediate in size between phonemes and syllables. We first evaluate how effectively the subword model automatically generates the spelling and pronunciation of new words. The subword model is then embedded in a multi-stage recognizer consisting of word, subword, and letter recognizers. In a preliminary set of experiments, the hybrid system outperforms a large-vocabulary isolated word recognizer. The subword model is also used to improve the letter recognizer: it generates a spelling cohort, which is used to train a small letter n-gram. The small letter n-gram has lower perplexity than a much larger n-gram and can be used by the letter recognizer in the spoken spelling mode. This could translate into a lower letter error rate in future letter recognition experiments.
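
To make the spelling-cohort idea concrete, the following is a minimal sketch, not the system described in this paper. It trains a letter bigram on a small cohort of candidate spellings and compares its perplexity on a target word with that of a letter bigram trained on unrelated words. The n-gram order (bigram), the add-one smoothing, the word lists, and all function and variable names are illustrative assumptions; the abstract specifies none of them.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # word-boundary markers (an assumption of this sketch)


def train_letter_bigram(spellings):
    """Count letter bigrams, with boundary markers, over a list of spellings."""
    bigrams, contexts = Counter(), Counter()
    for word in spellings:
        symbols = [BOS] + list(word) + [EOS]
        for prev, cur in zip(symbols, symbols[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def perplexity(model, spellings, alphabet_size=27):
    """Per-symbol perplexity under add-one smoothing.

    alphabet_size=27 assumes lowercase a-z plus the end-of-word marker.
    """
    bigrams, contexts = model
    log_prob, n_symbols = 0.0, 0
    for word in spellings:
        symbols = [BOS] + list(word) + [EOS]
        for prev, cur in zip(symbols, symbols[1:]):
            p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + alphabet_size)
            log_prob += math.log2(p)
            n_symbols += 1
    return 2 ** (-log_prob / n_symbols)


# Hypothetical cohort: spellings a subword model might propose for one new word.
cohort = ["cambrige", "kambridge", "cambridg"]
# Hypothetical stand-in for a general letter model trained on unrelated words.
general = ["boston", "seattle", "denver", "phoenix",
           "chicago", "houston", "atlanta", "portland"]

target = ["cambridge"]
cohort_ppl = perplexity(train_letter_bigram(cohort), target)
general_ppl = perplexity(train_letter_bigram(general), target)
print(f"cohort letter bigram perplexity:  {cohort_ppl:.2f}")
print(f"general letter bigram perplexity: {general_ppl:.2f}")
```

In this toy setting the cohort-trained model assigns higher probability to the target spelling because every letter transition in the target also occurs in the cohort. The figures it prints illustrate the mechanism only and say nothing about the perplexities obtained in the experiments.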

