SLTU-2008 - First International Workshop on Spoken Languages Technologies for Under-Resourced Languages

Hanoi, Vietnam
May 5-7, 2008

Unsupervised Adaptive Speech Technology for Limited Resource Languages: A Case Study for Tamil

Özgür Cetin (1), Madelaine Plauché (2), Udhaykumar Nallasamy (3)

(1) Yahoo!, Inc., Santa Clara, CA, USA
(2) International Computer Science Institute, Berkeley, CA, USA
(3) Carnegie Mellon University, Pittsburgh, PA, USA

This paper evaluates adaptive speech technology for creating low-cost, rapidly deployable speech recognizers for new languages with very limited data. A multi-modal (speech and touch) dialog system in Tamil, which delivered agricultural information to rural villagers, is described. Based on the field recordings from this system, a number of automatic speech recognition (ASR) adaptation techniques are compared, including cross-language transfer (English to Tamil), multilingual training, bootstrapping, and model adaptation (supervised and unsupervised). For this small-vocabulary task, supervised model adaptation using a small amount of target speech data yields the best results. In the supervised mode, we find no significant performance difference between adapting models from English and adapting models from Tamil that used a medium-sized data set at a significant labeling cost. Unsupervised adaptation from English yields slightly inferior but comparable recognition results. In summary, we find that model adaptation from a language with existing resources, using a very small amount of target data, is a viable option for rapidly building small-vocabulary speech recognizers.

Index Terms— Speech recognition, unsupervised learning.


Bibliographic reference. Cetin, Özgür / Plauché, Madelaine / Nallasamy, Udhaykumar (2008): "Unsupervised adaptive speech technology for limited resource languages: a case study for Tamil", in SLTU-2008, 98-101.