ISCA Archive Interspeech 2007

New word acquisition using subword modeling

Ghinwa F. Choueiter, Stephanie Seneff, James Glass

In this paper, we use subword modeling to learn the pronunciations and spellings of new words. The subwords are generated with a context-free grammar and are intermediate units between phonemes and syllables. We first evaluate the effectiveness of the subword model in automatically generating the spelling and pronunciation of new words. The subword model is then embedded in a multi-stage recognizer that consists of word, subword, and letter recognizers. In a preliminary set of experiments, the hybrid system outperforms a large-vocabulary isolated word recognizer. The subword model is also used to improve the letter recognizer: it generates a spelling cohort that is used to train a small letter n-gram. This small n-gram has lower perplexity than a much larger n-gram and can be used by the letter recognizer in spoken spelling mode. This could translate into an improved letter error rate in future letter recognition experiments.
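As a rough illustration of why a letter n-gram trained on a small spelling cohort can have lower perplexity than one trained on broader data, the toy sketch below trains two letter bigram models and scores the same spelling with each. Everything in it is an assumption made for illustration: the abstract does not state the n-gram order, the smoothing method, or the data, so the bigram order, add-one smoothing, the word lists, and the names `train_letter_bigram` and `perplexity` are hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
BOS, EOS = "<", ">"  # hypothetical word-boundary symbols


def train_letter_bigram(spellings):
    """Collect letter bigram and context counts from a list of spellings."""
    bigrams, contexts = Counter(), Counter()
    for word in spellings:
        symbols = [BOS] + list(word) + [EOS]
        for prev, cur in zip(symbols, symbols[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def perplexity(model, word):
    """Per-symbol perplexity of one spelling under an add-one-smoothed bigram."""
    bigrams, contexts = model
    vocab_size = len(ALPHABET) + 1  # 26 letters plus the end symbol
    symbols = [BOS] + list(word) + [EOS]
    log_prob = 0.0
    for prev, cur in zip(symbols, symbols[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log2(p)
    return 2 ** (-log_prob / (len(symbols) - 1))


# Made-up data: a small cohort of candidate spellings for one spoken word,
# and a broader word list standing in for the training data of a larger n-gram.
cohort = ["choueiter", "chouiter", "shweiter", "chueiter"]
general = ["table", "window", "garden", "simple", "yellow", "number", "choueiter"]

cohort_model = train_letter_bigram(cohort)
general_model = train_letter_bigram(general)

target = "choueiter"
pp_cohort = perplexity(cohort_model, target)
pp_general = perplexity(general_model, target)
print(f"cohort bigram perplexity:  {pp_cohort:.2f}")
print(f"general bigram perplexity: {pp_general:.2f}")
```

In this toy setup the cohort-trained model concentrates its probability mass on the letter sequences that the candidate spellings share, so it assigns the target spelling a lower perplexity than the model trained on the broader list.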

doi: 10.21437/Interspeech.2007-494

Cite as: Choueiter, G.F., Seneff, S., Glass, J. (2007) New word acquisition using subword modeling. Proc. Interspeech 2007, 1765-1768, doi: 10.21437/Interspeech.2007-494

  author={Ghinwa F. Choueiter and Stephanie Seneff and James Glass},
  title={{New word acquisition using subword modeling}},
  booktitle={Proc. Interspeech 2007},