16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Incorporating Prosodic Prominence Evidence into Term Weights for Spoken Content Retrieval

David N. Racca, Gareth J. F. Jones

Dublin City University, Ireland

We present an extended technique for spoken content retrieval (SCR) that exploits the prosodic characteristics of spoken terms to improve retrieval effectiveness. Our method promotes the rank of speech segments containing a high number of prosodically prominent terms. Given a set of queries and examples of relevant speech segments, we train a classifier to learn differences in the prosodic realisation of spoken terms mentioned in relevant and non-relevant segments. The classifier is trained with a set of lexical and prosodic features that capture local variations of prosodic prominence. For an unseen query, we perform SCR using an extension of the Okapi BM25 probabilistic retrieval function that incorporates the prosodic classifier's predictions into the computation of term weights. Experiments with the speech data from the SDPWS corpus of Japanese oral presentations and the queries and relevance assessment data from the NTCIR SpokenDoc task show that our approach provides improvements over purely text-based SCR approaches.
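
As a rough illustration of the idea of folding prominence predictions into BM25 term weights, the sketch below boosts each term occurrence by a per-occurrence prominence score before applying the usual BM25 saturation. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function name `bm25_prominence_score`, the mixing parameter `alpha`, the additive form `1 + alpha * p` of the boost, and the representation of a segment as `(term, prominence)` pairs, where `prominence` stands in for a classifier output in [0, 1]. It is not the authors' method.

```python
import math
from collections import defaultdict


def bm25_prominence_score(query_terms, segment, doc_freq, num_segments,
                          avg_len, k1=1.2, b=0.75, alpha=0.5):
    """Score one speech segment against a query (illustrative sketch only).

    segment: list of (term, prominence) pairs, one per spoken term occurrence;
             prominence is a hypothetical classifier score in [0, 1].
    doc_freq: dict mapping term -> number of segments containing it.
    alpha: hypothetical weight of the prosodic evidence; alpha=0 gives plain BM25.
    """
    length = len(segment)
    weighted_tf = defaultdict(float)
    for term, prominence in segment:
        # Assumption: each occurrence counts as 1 plus a prominence bonus.
        weighted_tf[term] += 1.0 + alpha * prominence

    # Standard BM25 length normalisation component.
    norm = k1 * (1.0 - b + b * length / avg_len)

    score = 0.0
    for term in set(query_terms):
        wtf = weighted_tf.get(term, 0.0)
        if wtf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative variant of the BM25 inverse document frequency.
        idf = math.log(1.0 + (num_segments - df + 0.5) / (df + 0.5))
        score += idf * wtf * (k1 + 1.0) / (wtf + norm)
    return score


# Toy example with made-up numbers: same words, different prominence.
doc_freq = {"retrieval": 2, "speech": 5}
prominent = [("retrieval", 0.9), ("speech", 0.1)]
flat = [("retrieval", 0.1), ("speech", 0.1)]
score_prominent = bm25_prominence_score(["retrieval"], prominent, doc_freq, 10, 2.0)
score_flat = bm25_prominence_score(["retrieval"], flat, doc_freq, 10, 2.0)
```

With these toy inputs, the segment in which the query term is spoken with higher predicted prominence receives the higher score, and setting `alpha=0` makes both segments score identically, as in purely text-based BM25.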

Bibliographic reference. Racca, David N. / Jones, Gareth J. F. (2015): "Incorporating prosodic prominence evidence into term weights for spoken content retrieval", in INTERSPEECH-2015, 1378-1382.