Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

A Linguistic Data Acquisition Front-End for Language Recognition Evaluation

Gang Liu, Chi Zhang, John H. L. Hansen

Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering & Computer Science, University of Texas at Dallas, Richardson, TX, USA

One of the major challenges for language identification (LID) systems is sparse training data. Manually collecting linguistic data in a controlled studio is usually expensive and impractical. Multilingual broadcast programs (Voice of America, for instance) offer a reasonable alternative source of linguistic data. However, unlike studio-collected data, broadcast programs usually contain much content other than pure linguistic data: foreground or background music, commercials, and noise from everyday life. In this study, a systematic processing approach is proposed to extract the linguistic data from broadcast media. Experimental results on NIST LRE 2009 data show that the proposed method provides a 22.2% relative improvement in segmentation accuracy and a 20.5% relative improvement in LID accuracy.


Bibliographic reference.  Liu, Gang / Zhang, Chi / Hansen, John H. L. (2012): "A linguistic data acquisition front-end for language recognition evaluation", In Odyssey-2012, 224-228.