ISCA Archive ICSLP 2000

A hierarchical language model incorporating class-dependent word models for OOV words recognition

Koichi Tanigaki, Hirofumi Yamamoto, Yoshinori Sagisaka

A new language model is proposed to meet the demand for recognizing out-of-vocabulary (OOV) words, that is, words not registered in the lexicon. The model is a class N-gram that incorporates a set of word models, each reflecting the statistical characteristics of phonotactics specific to a lexical class. Exploiting this class dependency improves recognition accuracy and makes it possible to identify the class of an OOV word. OOV words are recognized as transcribed portions carrying class labels, which supply semantic attributes of the OOV words to subsequent language processing. In an experimental application to Japanese personal and family names, the model performed nearly as well as in-vocabulary recognition, which serves as the upper bound.
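
The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a class bigram on the upper level and an add-one smoothed symbol bigram per class on the lower level, with romanized letters standing in for phonemes. The class names (`family`, `given`), the toy name lists, and all function names are hypothetical.

```python
import math
from collections import defaultdict


class PhonotacticBigram:
    """Add-one smoothed symbol bigram for one lexical class (illustrative only)."""

    def __init__(self, words, alphabet):
        # Possible next symbols: every symbol plus the end-of-word marker.
        self.vocab_size = len(set(alphabet)) + 1
        self.counts = defaultdict(lambda: defaultdict(int))
        for word in words:
            seq = ["<w>"] + list(word) + ["</w>"]
            for prev, cur in zip(seq, seq[1:]):
                self.counts[prev][cur] += 1

    def logprob(self, word):
        """Log probability of the symbol sequence under this class model."""
        seq = ["<w>"] + list(word) + ["</w>"]
        total_lp = 0.0
        for prev, cur in zip(seq, seq[1:]):
            total = sum(self.counts[prev].values())
            total_lp += math.log((self.counts[prev][cur] + 1) / (total + self.vocab_size))
        return total_lp


def oov_logprob(symbols, cls, prev_cls, class_bigram, word_models):
    """Upper level: P(class | previous class). Lower level: P(symbols | class)."""
    return math.log(class_bigram[(prev_cls, cls)]) + word_models[cls].logprob(symbols)


def classify_oov(symbols, prev_cls, class_bigram, word_models):
    """Return the class label that maximizes the hierarchical score."""
    return max(
        word_models,
        key=lambda c: oov_logprob(symbols, c, prev_cls, class_bigram, word_models),
    )


# Toy data (hypothetical): letters stand in for phonemes.
family_names = ["tanaka", "yamada", "suzuki"]
given_names = ["hiroshi", "yoshiko", "kenichi"]
alphabet = set("".join(family_names + given_names))

word_models = {
    "family": PhonotacticBigram(family_names, alphabet),
    "given": PhonotacticBigram(given_names, alphabet),
}
# Uniform class transition probabilities from a sentence-start context (assumed values).
class_bigram = {("<s>", "family"): 0.5, ("<s>", "given"): 0.5}

label = classify_oov("tanaka", "<s>", class_bigram, word_models)
```

Because each class has its own lower-level model, the class that yields the highest combined score serves as the class label for the transcribed OOV portion, which mirrors the labeling behavior the abstract describes.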


Cite as: Tanigaki, K., Yamamoto, H., Sagisaka, Y. (2000) A hierarchical language model incorporating class-dependent word models for OOV words recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 123-126

@inproceedings{tanigaki00_icslp,
  author={Koichi Tanigaki and Hirofumi Yamamoto and Yoshinori Sagisaka},
  title={{A hierarchical language model incorporating class-dependent word models for OOV words recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={123--126}
}