Odyssey 2010: The Speaker and Language Recognition Workshop

Brno, Czech Republic
28 June - 1 July 2010

The Albayzin 2008 Language Recognition Evaluation

Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, German Bordel, Amparo Varona (1)

(1) University of the Basque Country

The Albayzin 2008 Language Recognition Evaluation (LRE) was held from May to October 2008, and its results were presented and discussed among the participating teams at the 5th Biennial Workshop on Speech Technology, organized by the Spanish Network on Speech Technologies in November 2008. In this paper, we present (for the first time) a full description of the Albayzin 2008 LRE and analyze and discuss the recognition results. The evaluation was designed according to the test procedures, protocols and performance measures used in the NIST 2007 LRE. The KALAKA database, consisting of 16 kHz audio signals recorded from TV broadcasts, was created ad hoc and used for the evaluation. The four official languages spoken in Spain (Basque, Catalan, Galician and Spanish) were taken as target languages, with other (unknown) languages also recorded to allow open-set verification tests. The best system, employing state-of-the-art technology, yielded Cavg = 0.0552 (around 5% EER) in closed-set verification tests on a set of 30-second segments. This reveals the difficulty of the task, despite using 16 kHz speech signals and having only four target languages. We also plan to include Portuguese and English as target languages for the next Albayzin 2010 LRE.
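
For readers unfamiliar with the Cavg measure, the following is a minimal sketch of how a closed-set average cost can be computed from per-language error rates. It assumes the usual NIST 2007 LRE settings (C_miss = C_FA = 1, P_target = 0.5, no out-of-set term), which the abstract does not state explicitly. The function name `closed_set_cavg` and the error rates in the usage example are hypothetical and are not taken from the evaluation.

```python
from typing import Dict, Tuple


def closed_set_cavg(
    p_miss: Dict[str, float],
    p_fa: Dict[Tuple[str, str], float],
    c_miss: float = 1.0,    # assumed cost of a miss
    c_fa: float = 1.0,      # assumed cost of a false alarm
    p_target: float = 0.5,  # assumed prior of the target language
) -> float:
    """Average detection cost over all target languages (closed set).

    p_miss[t]     : miss rate when t is the target language
    p_fa[(t, n)]  : false-alarm rate for target t on segments of language n
    """
    languages = sorted(p_miss)
    n_lang = len(languages)
    # In the closed-set case the remaining prior mass is shared
    # equally among the non-target languages.
    p_non_target = (1.0 - p_target) / (n_lang - 1)

    total = 0.0
    for target in languages:
        cost = c_miss * p_target * p_miss[target]
        for other in languages:
            if other != target:
                cost += c_fa * p_non_target * p_fa[(target, other)]
        total += cost
    return total / n_lang


# Illustrative rates only (not results from the evaluation).
langs = ["Basque", "Catalan", "Galician", "Spanish"]
example_miss = {t: 0.10 for t in langs}
example_fa = {(t, n): 0.02 for t in langs for n in langs if t != n}
print(closed_set_cavg(example_miss, example_fa))
```

With equal priors on target and non-target trials, a system whose miss and false-alarm rates are both about 5% gets a Cavg of about 0.05, consistent with the Cavg and EER figures quoted above.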

Bibliographic reference.  Rodriguez-Fuentes, Luis Javier / Penagarikano, Mikel / Bordel, German / Varona, Amparo (2010): "The Albayzin 2008 Language Recognition Evaluation", In Odyssey-2010, paper 031.