15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Query-by-Example Spoken Term Detection on Multilingual Unconstrained Speech

Xavier Anguera (1), Luis Javier Rodriguez-Fuentes (2), Igor Szőke (3), Andi Buzo (4), Florian Metze (5), Mikel Penagarikano (2)

(1) Telefónica I+D, Spain
(2) Universidad del País Vasco, Spain
(3) Brno University of Technology, Czech Republic
(4) Universitatea Politehnica din Bucureşti, Romania
(5) Carnegie Mellon University, USA

As part of the MediaEval 2013 benchmark evaluation campaign, the Spoken Web Search (SWS) task aimed to perform Query-by-Example Spoken Term Detection (QbE-STD) using audio queries in a low-resource setting. After two successful editions and steadily growing interest in the scientific community, a special effort was made in SWS 2013 to prepare a challenging database, including speech in 9 different languages with diverse environment and channel conditions. In this paper, we first describe the database and the performance metrics. We then briefly review the algorithmic approaches followed by participants and present and discuss the results obtained, which demonstrate the feasibility of the proposed task even under such challenging conditions (multiple languages and unconstrained acoustic conditions). Finally, we analyze the fusion of the top-performing systems, which achieved a 30% relative improvement over the best single system in the evaluation, showing that a variety of approaches can be effectively combined to bring complementary information to the search for queries.

Bibliographic reference. Anguera, Xavier / Rodriguez-Fuentes, Luis Javier / Szőke, Igor / Buzo, Andi / Metze, Florian / Penagarikano, Mikel (2014): "Query-by-example spoken term detection on multilingual unconstrained speech", in INTERSPEECH-2014, 2459-2463.