5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Towards Robust Methods for Spoken Document Retrieval

Kenney Ng

MIT Laboratory for Computer Science, USA

In this paper, we investigate a number of robust indexing and retrieval methods in an effort to improve spoken document retrieval performance in the presence of speech recognition errors. In particular, we examine expanding the original query representation to include confusable terms; developing a new document-query retrieval measure based on approximate matching that is less sensitive to recognition errors; expanding the document representation to include multiple recognition hypotheses; modifying the original query using automatic relevance feedback to include new terms found in the top-ranked documents; and combining information from multiple subword unit representations. We study the different methods individually and then explore the effects of combining them. Experiments on radio broadcast news data show that using a combination of these methods can improve retrieval performance by over 20%.
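As a rough illustration of the first method listed in the abstract (expanding the query with confusable terms), the following minimal sketch may help. It is not the paper's implementation: the function names `expand_query` and `score`, the confusion table, its probabilities, the threshold, and the subword labels are all hypothetical, and the inner-product score is a stand-in for whatever retrieval measure the paper actually uses.

```python
from collections import Counter, defaultdict


def expand_query(query_terms, confusions, threshold=0.1):
    """Build a weighted query from the original terms plus confusable ones.

    `confusions` is a hypothetical table mapping a term to {alternative:
    probability that the recognizer outputs the alternative instead}.
    Original terms get weight 1.0; alternatives at or above `threshold`
    get their confusion probability as weight.
    """
    weights = defaultdict(float)
    for term in query_terms:
        weights[term] = max(weights[term], 1.0)
        for alt, prob in confusions.get(term, {}).items():
            if prob >= threshold:
                weights[alt] = max(weights[alt], prob)
    return dict(weights)


def score(weighted_query, doc_terms):
    """Weighted inner product of the query with document term counts."""
    counts = Counter(doc_terms)
    return sum(w * counts[t] for t, w in weighted_query.items())


# Hypothetical subword terms and confusion probabilities (assumed values).
confusions = {"s_p_iy": {"s_b_iy": 0.3, "z_p_iy": 0.05}}
query = ["s_p_iy"]
# A recognized document in which the true term was misrecognized.
doc = ["s_b_iy", "ch_eh_k"]

plain = score({t: 1.0 for t in query}, doc)
expanded_query = expand_query(query, confusions)
expanded = score(expanded_query, doc)
```

In this toy case the unexpanded query does not match the misrecognized document at all, while the expanded query gives it a small nonzero score through the confusable term. Down-weighting the alternatives is one possible design choice for limiting false matches; the threshold controls how many alternatives are admitted.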

Bibliographic reference. Ng, Kenney (1998): "Towards robust methods for spoken document retrieval", in ICSLP-1998, paper 1088.