This paper presents our experiments in question answering (QA) on speech corpora, focusing on the answer extraction step of the QA process. We present two approaches to answer extraction that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score the similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which were obtained using only hand-made ranking rules with little syntactic information. We evaluate the system on manual transcripts of speech from the EPPS English corpus and on a set of questions transcribed from spontaneous oral questions. This data is part of the CLEF 2009 evaluation track on QA on speech transcripts (QAst).
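As a rough illustration of what a learned reranker over lexical information might look like, the following minimal sketch trains a structured perceptron to order candidate answers. It is not the model described in the paper: the abstract does not specify the learner or the feature set, so the perceptron, the two features (`overlap`, `cand_len`), the function names, and the toy sentences are all assumptions made for illustration.

```python
# Illustrative sketch only: a perceptron reranker over hypothetical lexical
# features. The learner, features, and data are assumptions, not the paper's.

def lexical_features(question, context, candidate):
    """Hypothetical lexical features for one (context, candidate) pair."""
    q_tokens = set(question.lower().split())
    c_tokens = context.lower().split()
    shared = sum(1 for tok in c_tokens if tok in q_tokens)
    return {
        # Fraction of context tokens that also occur in the question.
        "overlap": shared / max(len(c_tokens), 1),
        # Length of the candidate answer in tokens.
        "cand_len": float(len(candidate.split())),
    }


def score(weights, feats):
    """Linear score: dot product of weights and feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_perceptron(examples, epochs=10):
    """examples: list of (question, [(context, candidate, is_correct), ...])."""
    weights = {}
    for _ in range(epochs):
        for question, cands in examples:
            feats = [lexical_features(question, ctx, cand) for ctx, cand, _ in cands]
            gold = next(i for i, (_, _, ok) in enumerate(cands) if ok)
            pred = max(range(len(cands)), key=lambda i: score(weights, feats[i]))
            if pred != gold:
                # Move weights toward the correct candidate, away from the prediction.
                for name, value in feats[gold].items():
                    weights[name] = weights.get(name, 0.0) + value
                for name, value in feats[pred].items():
                    weights[name] = weights.get(name, 0.0) - value
    return weights


def rerank(weights, question, cands):
    """cands: list of (context, candidate); returns them sorted best first."""
    return sorted(
        cands,
        key=lambda pair: score(weights, lexical_features(question, pair[0], pair[1])),
        reverse=True,
    )


# Toy training data (invented for this sketch).
train = [
    (
        "who chaired the committee on budgets",
        [
            ("Mrs Jones spoke about fisheries yesterday", "Mrs Jones", False),
            ("the committee on budgets was chaired by Mr Smith", "Mr Smith", True),
        ],
    )
]
weights = train_perceptron(train)

ranked = rerank(
    weights,
    "who opened the debate on fisheries",
    [
        ("the weather was fine", "fine"),
        ("the debate on fisheries was opened by Mrs Jones", "Mrs Jones"),
    ],
)
```

A dependency-based variant would keep the same training loop but swap `lexical_features` for features computed over parse trees, for example a similarity between the dependency path linking the question terms and the one linking the candidate to its context.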
Bibliographic reference. Comas, Pere R. / Turmo, Jordi / Màrquez, Lluís (2010): "Using dependency parsing and machine learning for factoid question answering on spoken documents", In INTERSPEECH-2010, 1265-1268.