Sixth European Conference on Speech Communication and Technology
In this paper we address the problem of information extraction from speech data, with a focus on improving robustness to automatic recognition errors. We describe a baseline probabilistic model that uses word-class smoothing in a phrase n-gram language model. The model is adjusted to the error characteristics of a speech recognizer by inserting error tokens in the training data and by using word confidences in decoding to account for possible errors in the recognition output. Experiments show improved performance when training and test conditions are matched.
Bibliographic reference. Palmer, David D. / Ostendorf, Mari / Burger, John D. (1999): "Robust information extraction from spoken language data", In EUROSPEECH'99, 1035-1038.