Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2014)

St. Petersburg, Russia
May 14-16, 2014

Speech Recognition and Keyword Spotting for Low-Resource Languages: Babel Project Research at CUED

Mark J. F. Gales, Kate M. Knill, Anton Ragni, Shakti P. Rath

Cambridge University Engineering Department, Trumpington Street, Cambridge, UK

Recently there has been increased interest in Automatic Speech Recognition (ASR) and Keyword Spotting (KWS) systems for low-resource languages. One of the driving forces behind this research direction is the IARPA Babel project. This paper describes some of the research funded by this project at Cambridge University, as part of the Lorelei team co-ordinated by IBM. A range of topics is discussed, including deep neural network based acoustic models, data augmentation, and zero acoustic model resource systems. Performance for all approaches is evaluated using the Limited (approximately 10 hours) and/or Full (approximately 80 hours) language packs distributed by IARPA. Both KWS and ASR performance figures are given. Though absolute performance varies with the language and the keyword list, the approaches described show consistent trends over the languages investigated to date. Results from comparable systems over the five Option Period 1 languages indicate a strong correlation between ASR performance and KWS performance.

Index Terms: keyword spotting, deep neural network, low-resource languages, multi-lingual systems.

Bibliographic reference. Gales, Mark J. F. / Knill, Kate M. / Ragni, Anton / Rath, Shakti P. (2014): "Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED", in SLTU-2014, 16-23.