ISCA Archive Interspeech 2009

Combined low level and high level features for out-of-vocabulary word detection

Benjamin Lecouteux, Georges Linarès, Benoit Favre

This paper addresses Out-Of-Vocabulary (OOV) word detection in Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We propose a method, inspired by confidence measures, that analyzes the recognition system's outputs to automatically detect errors caused by OOV words. The method combines features drawn from acoustics, linguistics, the decoding graph, and semantics. We evaluate each feature separately and estimate their complementarity. Experiments are conducted on a large French broadcast news corpus from the ESTER evaluation campaign. Results show good performance in real conditions: the method achieves an OOV word detection rate of 43%–90% with a false detection rate of 2.5%–17.5%.


doi: 10.21437/Interspeech.2009-344

Cite as: Lecouteux, B., Linarès, G., Favre, B. (2009) Combined low level and high level features for out-of-vocabulary word detection. Proc. Interspeech 2009, 1187-1190, doi: 10.21437/Interspeech.2009-344

@inproceedings{lecouteux09_interspeech,
  author={Benjamin Lecouteux and Georges Linarès and Benoit Favre},
  title={{Combined low level and high level features for out-of-vocabulary word detection}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1187--1190},
  doi={10.21437/Interspeech.2009-344}
}