8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Evaluating Multiple LVCSR Model Combination in NTCIR-3 Speech-Driven Web Retrieval Task

Masahiko Matsushita (1), Hiromitsu Nishizaki (1), Takehito Utsuro (2), Yasuhiro Kodama (1), Seiichi Nakagawa (1)

(1) Toyohashi University of Technology, Japan
(2) Kyoto University, Japan

This paper studies speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The main focus is on improving the recognition accuracy of spoken queries and, in turn, the retrieval accuracy of speech-driven Web retrieval. We experimentally evaluate techniques for combining the outputs of multiple LVCSR models when recognizing spoken queries. As model combination techniques, we compare SVM learning with conventional voting schemes such as ROVER. We show that combining multiple LVCSR models improves both speech recognition and retrieval accuracy in speech-driven text retrieval. We also show that model combination by SVM learning outperforms conventional voting schemes in both speech recognition and retrieval accuracy.
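To illustrate the kind of voting scheme the abstract refers to, the following is a minimal sketch of word-level majority voting over the outputs of several recognizers. It is not the authors' implementation and not the full ROVER procedure: ROVER first aligns the hypotheses by dynamic programming, whereas this sketch assumes the hypotheses are already aligned slot by slot and padded to equal length. The function name `vote_aligned`, the `NULL` placeholder, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical placeholder marking "no word" in an aligned slot.
NULL = "@"


def vote_aligned(hypotheses):
    """Majority vote per slot over pre-aligned, equal-length word sequences.

    Assumes alignment has already been done; real ROVER builds the
    alignment itself before voting.
    """
    if not hypotheses:
        return []
    length = len(hypotheses[0])
    if any(len(h) != length for h in hypotheses):
        raise ValueError("hypotheses must be pre-aligned to the same length")

    result = []
    for slot in zip(*hypotheses):
        counts = Counter(slot)
        best = max(counts.values())
        # Assumed tie-break: prefer the word from the earliest system listed.
        winner = next(word for word in slot if counts[word] == best)
        if winner != NULL:  # a winning NULL means "emit nothing here"
            result.append(winner)
    return result


if __name__ == "__main__":
    # Toy outputs from three hypothetical LVCSR models for one spoken query.
    outputs = [
        ["speech", "driven", "web", "retrieval"],
        ["speech", "given", "web", "retrieval"],
        ["speech", "driven", NULL, "retrieval"],
    ]
    print(" ".join(vote_aligned(outputs)))
```

A learned combiner such as the SVM approach named in the abstract would replace the simple count with a classifier decision per word; the features it would use are not specified here, so that step is left out of the sketch.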


Bibliographic reference.  Matsushita, Masahiko / Nishizaki, Hiromitsu / Utsuro, Takehito / Kodama, Yasuhiro / Nakagawa, Seiichi (2003): "Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task", In EUROSPEECH-2003, 1205-1208.