12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Web-Enhanced Content Retrieval for Information Access Dialogue System

Donghyeon Lee (1), Cheongjae Lee (2), Minwoo Jeong (1), Kyungduk Kim (1), Seokhwan Kim (1), Junhwi Choi (1), Gary Geunbae Lee (1)

(1) POSTECH, Korea
(2) Kyoto University, Japan

We consider the problem of content retrieval with complex queries for an information access dialogue system. Traditional information access dialogue systems rely on exact query matching and heuristic rules to find relevant content in a relational database. To handle complex queries, some dialogue systems employ deep semantic processing such as full semantic parsing and ontology-based reasoning. However, these systems require large amounts of semantic annotation and domain expert knowledge, which are often very expensive to obtain, so their use in practice has been limited. In this paper, we present a simple alternative in which documents retrieved by web search enhance vector space model-based content retrieval. Our model captures underlying co-occurrence patterns between the query and the contents, and an efficient ranking algorithm retrieves the relevant contents. One merit of the proposed approach is that it does not require heavy semantic processing and therefore yields efficient content retrieval. We demonstrate that our method is beneficial in an electronic program guide dialogue system.
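
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes plain TF-IDF weighting with cosine similarity, and it assumes that each content item's database description is simply concatenated with text from web-searched documents before vectorization. All names (rank_contents, web_docs, the example programs) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_contents(query, contents, web_docs):
    """Rank content items against a query in a TF-IDF vector space.

    contents: {title: database description}
    web_docs: {title: [web-searched text, ...]} (hypothetical input; the
              sketch does not perform any web search itself)
    Returns a list of (title, score) pairs, best match first.
    """
    titles = list(contents)
    # Expand each item with its web-searched text (assumed: concatenation).
    expanded = [
        tokenize(contents[t] + " " + " ".join(web_docs.get(t, [])))
        for t in titles
    ]
    n = len(expanded)
    doc_freq = Counter(term for doc in expanded for term in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in expanded
    ]
    # Query terms unseen in every expanded item carry no weight.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(title, cosine(q, vec)) for title, vec in zip(titles, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented example data for illustration only.
contents = {
    "Ocean Planet": "documentary nature",
    "Late Night Laughs": "comedy talk show",
}
web_docs = {
    "Ocean Planet": ["a series about whales sharks and coral reefs"],
    "Late Night Laughs": ["celebrity interviews and stand-up jokes"],
}
ranked = rank_contents("something with whales and sharks", contents, web_docs)
print(ranked[0][0])
```

In this toy example the query shares no terms with either database description, so exact matching on the database fields alone finds nothing; the web text is what lets the first item rank highest.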

Bibliographic reference.  Lee, Donghyeon / Lee, Cheongjae / Jeong, Minwoo / Kim, Kyungduk / Kim, Seokhwan / Choi, Junhwi / Lee, Gary Geunbae (2011): "Web-enhanced content retrieval for information access dialogue system", In INTERSPEECH-2011, 1297-1300.