INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

A Comparison of Query-by-Example Methods for Spoken Term Detection

Wade Shen, Christopher M. White, Timothy J. Hazen

MIT, USA

In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids the out-of-vocabulary (OOV) issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index, and we analyze factors affecting the performance of these systems. We show that the best systems under this paradigm achieve 77% precision when retrieving utterances from conversational telephone speech and returning 10 results from a single query (performance that is better than a similar dictionary-based approach), suggesting significant utility for applications requiring high precision. We also show that these systems can be further improved using relevance feedback: by incorporating four additional queries, the precision of the best system can be improved by 13.7% relative. Our systems perform well despite high phone recognition error rates (> 40%) and make use of no pronunciation or letter-to-sound resources.
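The two figures quoted above (precision over the top 10 returned results, and a relative improvement) can be illustrated with a minimal sketch. The function names, the example ranking, and the choice to apply the 13.7% relative gain to the 77% single-query figure are all assumptions made for illustration; the abstract does not state the baseline used in the relevance-feedback experiment, and this is not the authors' evaluation code.

```python
def precision_at_n(ranked_relevance, n=10):
    """Fraction of the top-n returned utterances that are relevant.

    ranked_relevance: list of booleans, best-ranked result first
    (hypothetical input format, assumed for this sketch).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_relevance[:n]
    # Divide by n, so a system returning fewer than n results is penalized.
    return sum(1 for is_relevant in top if is_relevant) / n


def apply_relative_gain(baseline, relative_gain):
    """Absolute score after a relative improvement, e.g. 0.137 for 13.7%."""
    return baseline * (1.0 + relative_gain)


# Hypothetical ranking: 8 of the 10 returned utterances contain the term.
ranked = [True, True, False, True, True, True, False, True, True, True]
p10 = precision_at_n(ranked, n=10)

# Assumption: the 13.7% relative gain is applied to a 0.77 baseline.
improved = apply_relative_gain(0.77, 0.137)

print(f"precision at 10: {p10:.2f}")
print(f"precision after relative gain: {improved:.3f}")
```

Under that assumed baseline, a 13.7% relative gain corresponds to roughly 10.5 absolute points, which shows why relative and absolute improvements should not be read interchangeably.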


Bibliographic reference.  Shen, Wade / White, Christopher M. / Hazen, Timothy J. (2009): "A comparison of query-by-example methods for spoken term detection", In INTERSPEECH-2009, 2143-2146.