8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Building a Test Collection for Speech-Driven Web Retrieval

Atsushi Fujii (1), Katunobu Itou (2)

(1) University of Tsukuba, Japan
(2) AIST, Japan

This paper describes a test collection (benchmark data) for retrieval systems driven by spoken queries. The collection was produced in a subtask of the NTCIR-3 Web retrieval task, which was conducted within a TREC-style evaluation workshop. The search topics of the Web retrieval task were used to produce spoken queries, and its document collection was used to build language models for speech recognition. We used this collection to evaluate the performance of our retrieval system. Experimental results showed that (a) using the target documents for language modeling and (b) increasing the vocabulary size in speech recognition were both effective in improving system performance.

Bibliographic reference. Fujii, Atsushi / Itou, Katunobu (2003): "Building a test collection for speech-driven web retrieval", in EUROSPEECH-2003, 1153-1156.