Sixth European Conference on Speech Communication and Technology
The problem of unknown words is addressed using automatically generated filler fragments, which augment the lexicon and are incorporated into the language model. These fragments are used to reduce the damage to in-vocabulary words, to detect out-of-vocabulary (OOV) regions, and to provide a phonetic transcription for these regions. The performance of this technique has been evaluated in terms of damage reduction error rate and OOV tagging rate. Significant improvements are reported on both measures. In particular, the paper demonstrates the influence of appropriately tuning the language model factor and word penalties, as well as the usefulness of cross-word triphones across fragment boundaries.
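As a rough illustration of the idea, and not the authors' implementation, the sketch below shows how runs of filler fragments in a decoded hypothesis could be collapsed into tagged OOV regions that carry a phonetic transcription, together with a conventional log-domain score that combines a language model factor and a word penalty. The names `Token`, `tag_oov_regions`, `combined_score`, the `<OOV>` label and the sign convention of the penalty are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    """One decoded unit: a lexicon word or a filler fragment (hypothetical type)."""
    label: str
    phones: Tuple[str, ...]
    is_fragment: bool


def tag_oov_regions(tokens: List[Token]) -> List[Tuple[str, Tuple[str, ...]]]:
    """Collapse each run of consecutive fragments into one "<OOV>" region.

    The phone sequences of the fragments in a run are concatenated, so the
    region keeps a phonetic transcription of the presumed unknown word.
    """
    output: List[Tuple[str, Tuple[str, ...]]] = []
    run: List[str] = []
    for tok in tokens:
        if tok.is_fragment:
            run.extend(tok.phones)
            continue
        if run:
            output.append(("<OOV>", tuple(run)))
            run = []
        output.append((tok.label, tok.phones))
    if run:  # hypothesis ended inside a fragment run
        output.append(("<OOV>", tuple(run)))
    return output


def combined_score(acoustic_logprob: float, lm_logprob: float,
                   lm_factor: float, word_penalty: float) -> float:
    """Assumed scoring convention: scaled LM log-probability minus a per-token penalty."""
    return acoustic_logprob + lm_factor * lm_logprob - word_penalty


hypothesis = [
    Token("the", ("dh", "ax"), False),
    Token("frag_17", ("k", "l"), True),   # hypothetical fragment identifiers
    Token("frag_42", ("aa", "k"), True),
    Token("spoke", ("s", "p", "ow", "k"), False),
]
tagged = tag_oov_regions(hypothesis)
```

In a setup like this, the language model factor and the penalty applied to fragments would control how readily the decoder prefers a fragment sequence over an in-vocabulary word, which is the kind of tuning the abstract refers to.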
Bibliographic reference. Klakow, Dietrich / Rose, Georg / Aubert, Xavier (1999): "OOV-detection in large vocabulary system using automatically defined word-fragments as fillers", In EUROSPEECH'99, 49-52.