The NIST 2011 Language Recognition Evaluation (LRE11) focuses on language pair discrimination for 24 languages/dialects, some of which may be considered mutually intelligible or closely related. LRE11 required new data for all languages, comprising both conversational telephone speech and broadcast narrowband speech from multiple sources in each language. Given the potential for confusion among the varieties in the collection, manual language auditing required special care, including an assessment of inter-auditor consistency. We report on collection methods, auditing approaches, and results.
Cite as: Strassel, S., Walker, K., Jones, K., Graff, D., Cieri, C. (2012) New resources for recognition of confusable linguistic varieties: the LRE11 corpus. Proc. The Speaker and Language Recognition Workshop (Odyssey 2012), 202-208
@inproceedings{strassel12_odyssey,
  author={Stephanie Strassel and Kevin Walker and Karen Jones and Dave Graff and Christopher Cieri},
  title={{New resources for recognition of confusable linguistic varieties: the LRE11 corpus}},
  year=2012,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2012)},
  pages={202--208}
}