ISCA Archive Eurospeech 1999

Large Span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French

Frédéric Béchet, Alexis Nasr, Thierry Spriet, Renato de Mori

Homophones are one of the specific problems of Automatic Speech Recognition (ASR) in French. The phenomenon is particularly frequent for some inflections, such as the singular/plural inflection (72% of the 40.7K lemmas in our 240K-word dictionary have inflected forms that are homophonic). To take into account word dependencies spanning a variable number of words, it is useful to merge local language models, such as 3-gram or 3-class models, with large-span models. In this paper we present two kinds of models: a phrase-based model, using phrases obtained from a training corpus by means of a finite-state parser; and a homophone cache-based model, using constraints derived from word histories stored in a cache memory.
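As a rough illustration of how a local model might be merged with a cache-derived constraint, the toy sketch below rescores two homophonic candidates by linear interpolation. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `rescore_homophones`, the use of linear interpolation with weight `lam`, the representation of the cache as a window of grammatical-number tags, and the probabilities themselves.

```python
from collections import deque


def rescore_homophones(candidates, local_prob, cache, lam=0.7):
    """Toy rescoring of homophone candidates (hypothetical, not the paper's method).

    candidates: list of (word, number_tag) pairs that sound identical.
    local_prob: dict mapping word -> probability from a local model (e.g. 3-gram).
    cache:      iterable of number tags ("sg"/"pl") seen in the recent history.
    lam:        assumed interpolation weight given to the local model.
    """
    tags = list(cache)
    scores = {}
    for word, tag in candidates:
        if tags:
            # Cache score: share of recent words carrying the same number tag.
            p_cache = tags.count(tag) / len(tags)
            scores[word] = lam * local_prob[word] + (1 - lam) * p_cache
        else:
            # Empty cache: fall back to the local model alone.
            scores[word] = local_prob[word]
    return scores


# Recent history: two plural words and one singular word (made-up data).
cache = deque(["pl", "pl", "sg"], maxlen=5)

# "chante" and "chantent" are pronounced identically in French.
candidates = [("chante", "sg"), ("chantent", "pl")]
local_prob = {"chante": 0.5, "chantent": 0.5}  # the local model cannot decide

scores = rescore_homophones(candidates, local_prob, cache)
best = max(scores, key=scores.get)
print(best, scores)
```

With these made-up numbers the local model is undecided, and the mostly plural history tips the choice toward the plural form.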


doi: 10.21437/Eurospeech.1999-352

Cite as: Béchet, F., Nasr, A., Spriet, T., Mori, R.d. (1999) Large Span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1763-1766, doi: 10.21437/Eurospeech.1999-352

@inproceedings{bechet99_eurospeech,
  author={Frédéric Béchet and Alexis Nasr and Thierry Spriet and Renato de Mori},
  title={{Large Span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1763--1766},
  doi={10.21437/Eurospeech.1999-352}
}