14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

The Albayzin 2012 Language Recognition Evaluation

Luis Javier Rodríguez-Fuentes (1), Niko Brümmer (2), Mikel Penagarikano (1), Amparo Varona (1), Germán Bordel (1), Mireia Diez (1)

(1) Universidad del País Vasco, Spain
(2) Agnitio, South Africa

The Albayzin 2012 Language Recognition Evaluation (LRE), carried out from June to October 2012, was the third effort by the Spanish/Portuguese community to benchmark language recognition technology. As in the previous Albayzin 2008 and 2010 evaluations, the task consisted of deciding whether or not a target language was spoken in a test utterance. The primary condition involved 6 target languages for which plenty of training data was available: English, Portuguese and the four official languages in Spain (Basque, Catalan, Galician and Spanish). A new, challenging condition was defined involving 4 target languages for which no training data were available: French, German, Greek and Italian. In both cases, other (Out-Of-Set) languages were also recorded to allow open-set verification tests. An innovative feature of this evaluation, not common to other evaluations, was that the audio data for system development and evaluation were extracted from YouTube videos. Also, a new performance metric was proposed, the so-called Multiclass Cross-Entropy, which summarizes in a single figure the information provided by system scores, without the need to make hard decisions. This paper presents the main features of the evaluation and analyses the performance of the submitted systems under the different conditions, including the confusion among target languages.
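
The abstract does not give the formula for the Multiclass Cross-Entropy metric; the exact definition is in the full paper. As a rough illustration of how a score-based metric can avoid hard decisions, the following is a minimal sketch of a generic multiclass cross-entropy. It assumes that each system score is a natural-log likelihood per class, that class priors are flat unless given, and that an out-of-set class, if present, is simply treated as one more class. The function name and its parameters are hypothetical and are not taken from the evaluation plan.

```python
import math


def multiclass_cross_entropy(log_likelihoods, true_labels, priors=None):
    """Average negative log2-posterior of the true class, in bits.

    Generic illustration only (an assumption, not the evaluation's exact metric).
    log_likelihoods: list of per-trial lists of natural-log class scores.
    true_labels: list of true class indices, one per trial.
    priors: optional strictly positive class priors; flat when omitted.
    """
    n_classes = len(log_likelihoods[0])
    if priors is None:
        priors = [1.0 / n_classes] * n_classes

    total = 0.0
    for scores, label in zip(log_likelihoods, true_labels):
        # Log of (likelihood * prior) for every class.
        joint = [s + math.log(p) for s, p in zip(scores, priors)]
        # Log-sum-exp normalizer, shifted by the maximum for numerical stability.
        m = max(joint)
        log_norm = m + math.log(sum(math.exp(j - m) for j in joint))
        # Negative log-posterior of the true class for this trial.
        total += -(joint[label] - log_norm)

    # Average over trials and convert from nats to bits.
    return total / len(true_labels) / math.log(2)
```

Under these assumptions, a system whose scores carry no information (equal scores for all classes) obtains log2(N) bits for N classes, and confident correct scores drive the value towards zero, so lower is better.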

Bibliographic reference. Rodríguez-Fuentes, Luis Javier / Brümmer, Niko / Penagarikano, Mikel / Varona, Amparo / Bordel, Germán / Diez, Mireia (2013): "The Albayzin 2012 Language Recognition Evaluation", in INTERSPEECH-2013, 1497-1501.