In this paper, we investigate the impact of automatic speech recognition (ASR) errors on the accuracy of topic identification in conversational telephone speech. We present a modified TF-IDF feature weighting calculation that provides significant robustness under various recognition error conditions. For our experiments, we decode conversations from the Fisher corpus with a single recognizer tuned to run at various speeds, producing both 1-best and lattice outputs. We then use an SVM classifier to perform topic identification on this output. We observe that classifiers incorporating confidence information are significantly more robust to recognition errors than those that treat the recognizer output as unweighted text.
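The abstract does not give the modified TF-IDF formula, so the following is only a minimal sketch of one common way to fold recognizer confidence into TF-IDF, not the authors' calculation. It assumes each conversation is available as a list of (word, confidence) pairs, replaces integer word counts with summed confidences (expected counts), and uses a smoothed, "soft" document frequency. The function names `expected_counts` and `confidence_tfidf`, the smoothing constants, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def expected_counts(hyp):
    """Sum per-token confidences into expected word counts for one conversation.

    hyp: list of (word, confidence) pairs, confidence in [0, 1] (assumed input format).
    """
    counts = defaultdict(float)
    for word, conf in hyp:
        counts[word] += conf
    return dict(counts)


def confidence_tfidf(docs):
    """Return one sparse {word: weight} vector per conversation.

    Assumption: TF is the expected count normalized by the conversation's total
    expected count; IDF uses a soft document frequency in which each conversation
    contributes min(1, expected count) for a word, with add-one smoothing.
    """
    per_doc = [expected_counts(d) for d in docs]
    n_docs = len(per_doc)

    # Soft document frequency: a low-confidence occurrence counts as a fraction.
    df = defaultdict(float)
    for counts in per_doc:
        for word, c in counts.items():
            df[word] += min(1.0, c)

    vectors = []
    for counts in per_doc:
        total = sum(counts.values()) or 1.0  # guard against empty conversations
        vectors.append({
            word: (c / total) * math.log((n_docs + 1.0) / (df[word] + 1.0))
            for word, c in counts.items()
        })
    return vectors


# Toy example (invented data): two short "conversations" with word confidences.
docs = [
    [("music", 0.9), ("uh", 0.5), ("music", 0.8)],
    [("sports", 0.95), ("uh", 0.4)],
]
vectors = confidence_tfidf(docs)
```

Treating the 1-best output as unweighted text corresponds to setting every confidence to 1.0. The resulting sparse vectors could then be passed to any SVM implementation for topic classification.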
Bibliographic reference. Wintrode, Jonathan / Kulp, Scott (2009): "Techniques for rapid and robust topic identification of conversational telephone speech", In INTERSPEECH-2009, 1471-1474.