7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Probabilistic Retrieval Based on Document Representations

Wolfgang Macherey, Jörg Viechtbauer, Hermann Ney

RWTH Aachen - University of Technology, Germany

Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In the recent past, research has increasingly focused on developing heuristic and probabilistic retrieval metrics that are suitable for retrieving spoken documents. So far, many heuristic retrieval metrics, e.g. the SMART-2 metric, have proven more efficient than most advanced statistical approaches to SDR. In this paper, we propose a new probabilistic approach that is based on interpolations between document representations. This approach can be interpreted as a kind of nearest-neighbor concept between documents, in which a query is treated as a document. Experiments performed on the TREC-7 and TREC-8 SDR tasks show results comparable to or even better than those of the SMART-2 metric.


Bibliographic reference. Macherey, Wolfgang / Viechtbauer, Jörg / Ney, Hermann (2002): "Probabilistic retrieval based on document representations", in ICSLP-2002, 1481-1484.