SLTU-2008 - First International Workshop on Spoken Languages Technologies for Under-Resourced Languages
This paper evaluates adaptive speech technology for creating low-cost, rapidly deployable speech recognizers for new languages with very limited data. A multi-modal (speech and touch) dialog system in Tamil, which delivered agricultural information to rural villagers, is described. Based on the field recordings from this system, a number of automatic speech recognition (ASR) adaptation techniques are compared, including cross-language transfer (English to Tamil), multilingual training, bootstrapping, and model adaptation (supervised and unsupervised). For this small-vocabulary task, supervised model adaptation using a small amount of target speech data yields the best results. In the supervised mode, we find no significant performance difference between adapting models from English and adapting models from Tamil that used a medium-sized data set at a significant labeling cost. Unsupervised adaptation from English yields slightly inferior but comparable recognition results. In summary, we find that model adaptation from a language with existing resources, using a very small amount of target data, is a viable option for rapidly building small-vocabulary speech recognizers.
Index Terms: speech recognition, unsupervised learning.
Bibliographic reference. Cetin, Özgür / Plauché, Madelaine / Nallasamy, Udhaykumar (2008): "Unsupervised adaptive speech technology for limited resource languages: a case study for Tamil", In SLTU-2008, 98-101.