ISCA Archive Odyssey 2010

The Albayzin 2008 Language Recognition Evaluation

Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, German Bordel, Amparo Varona

The Albayzin 2008 Language Recognition Evaluation (LRE) was held from May to October 2008, and its results were presented and discussed among the participating teams at the 5th Biennial Workshop on Speech Technology, organized by the Spanish Network on Speech Technologies in November 2008. In this paper, we present (for the first time) a full description of the Albayzin 2008 LRE and analyze and discuss the recognition results. The evaluation was designed according to the test procedures, protocols and performance measures used in the NIST 2007 LRE. The KALAKA database, consisting of 16 kHz audio signals recorded from TV broadcasts, was created ad hoc and used for the evaluation. The four official languages spoken in Spain (Basque, Catalan, Galician and Spanish) were taken as target languages; other (unknown) languages were also recorded to allow open-set verification tests. The best system, employing state-of-the-art technology, yielded Cavg = 0.0552 (around 5% EER) in closed-set verification tests on a set of 30-second segments. This reveals the difficulty of the task, despite using 16 kHz speech signals and having only four target languages. We also plan to include Portuguese and English as target languages for the next Albayzin 2010 LRE.
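
As an illustration of the performance measure mentioned above, the following is a minimal sketch of the closed-set average cost (Cavg) in the form used by the NIST 2007 LRE, assuming unit costs for misses and false alarms and a target prior of 0.5. The function name, argument layout and error rates are hypothetical and chosen only for illustration; they are not results from the Albayzin 2008 LRE.

```python
def closed_set_cavg(p_miss, p_fa, p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Average detection cost over target languages (closed-set condition).

    p_miss: dict mapping target language -> miss rate
    p_fa:   dict mapping (target, non_target) -> false-alarm rate
    Assumes no out-of-set languages, so the non-target prior is split
    evenly among the remaining target languages.
    """
    languages = sorted(p_miss)
    n = len(languages)
    p_non_target = (1.0 - p_target) / (n - 1)
    total = 0.0
    for target in languages:
        # Cost of missing segments that are in the target language.
        cost = c_miss * p_target * p_miss[target]
        # Cost of accepting segments that are in another language.
        for other in languages:
            if other != target:
                cost += c_fa * p_non_target * p_fa[(target, other)]
        total += cost
    return total / n


# Hypothetical rates: every miss and false-alarm rate set to 5%.
langs = ["Basque", "Catalan", "Galician", "Spanish"]
p_miss = {lang: 0.05 for lang in langs}
p_fa = {(t, o): 0.05 for t in langs for o in langs if t != o}
cavg = closed_set_cavg(p_miss, p_fa)
print(f"{cavg:.4f}")  # prints 0.0500
```

With equal miss and false-alarm rates, the average cost equals that common rate, which is why a Cavg near 0.055 corresponds to roughly 5% EER.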

Cite as: Rodriguez-Fuentes, L.J., Penagarikano, M., Bordel, G., Varona, A. (2010) The Albayzin 2008 Language Recognition Evaluation. Proc. The Speaker and Language Recognition Workshop (Odyssey 2010), paper 31

  author={Luis Javier Rodriguez-Fuentes and Mikel Penagarikano and German Bordel and Amparo Varona},
  title={{The Albayzin 2008 Language Recognition Evaluation}},
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2010)},
  pages={paper 31}