EUROSPEECH 2003 - INTERSPEECH 2003
This paper studies speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The main focus is on improving the speech recognition accuracy of spoken queries and, in turn, the retrieval accuracy of speech-driven Web retrieval. We experimentally evaluate techniques for combining the outputs of multiple LVCSR models in recognizing spoken queries. As model combination techniques, we compare SVM learning with conventional voting schemes such as ROVER. We show that combining multiple LVCSR models improves both speech recognition accuracy and retrieval accuracy in speech-driven text retrieval. We also show that model combination by SVM learning outperforms conventional voting schemes in both speech recognition and retrieval accuracy.
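As a rough illustration of the voting idea behind schemes such as ROVER, the sketch below takes word hypotheses from several recognizers and keeps the most frequent word in each position. It is a simplification under stated assumptions, not the method evaluated in the paper: it assumes the hypotheses have already been aligned to equal length (ROVER itself builds that alignment by dynamic programming and can also weight votes by confidence scores), and the names `majority_vote` and `NULL` are hypothetical.

```python
from collections import Counter

NULL = ""  # hypothetical marker for "no word" in an aligned slot


def majority_vote(aligned_hyps):
    """Pick the most frequent word in each aligned slot.

    aligned_hyps: list of equal-length word lists, one per recognizer.
    Ties go to the recognizer listed first. Slots won by NULL are dropped.
    """
    if not aligned_hyps:
        return []
    length = len(aligned_hyps[0])
    if any(len(h) != length for h in aligned_hyps):
        raise ValueError("hypotheses must be pre-aligned to equal length")

    result = []
    for slot in zip(*aligned_hyps):
        counts = Counter(slot)
        best = max(counts.values())
        # First word, in recognizer order, that reaches the top count.
        winner = next(w for w in slot if counts[w] == best)
        if winner != NULL:
            result.append(winner)
    return result


# Illustrative (made-up) outputs of three recognizers for one spoken query.
hyps = [
    ["speech", "driven", "web", "retrieval"],
    ["speech", "given", "web", "retrieval"],
    ["speech", "driven", "wet", "retrieval"],
]
combined = majority_vote(hyps)
```

Each recognizer makes a different error here, so the per-slot vote recovers a word sequence that no single recognizer produced without at least matching another. An SVM-based combination, by contrast, would learn from features of each hypothesized word whether to accept it, rather than relying on counts alone.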
Bibliographic reference. Matsushita, Masahiko / Nishizaki, Hiromitsu / Utsuro, Takehito / Kodama, Yasuhiro / Nakagawa, Seiichi (2003): "Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task", In EUROSPEECH-2003, 1205-1208.