Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2014)
St. Petersburg, Russia
Automatic speech processing technologies hold great potential to facilitate the urgent task of documenting the world's languages. The present research explores the application of speech recognition tools to a little-documented language, with a view to facilitating annotation, transcription and linguistic analysis. The target language is Yongning Na (a.k.a. Mosuo), an unwritten Sino-Tibetan language with fewer than 50,000 speakers. An acoustic model of Na was built using CMU Sphinx. In addition to this light model, trained on a small data set (only 4 hours of speech from 1 speaker), heavyweight models from five national languages (English, French, Chinese, Vietnamese and Khmer) were applied to the same data. Preliminary results are reported, and perspectives for the long road ahead are outlined.
Index Terms: Acoustic models, automatic speech recognition (ASR), multilingual modelling, under-resourced languages, endangered languages, Yongning Na, Naish languages, language portability, statistical language modeling, crosslingual acoustic modelling and adaptation
Bibliographic reference. Do, Thi-Ngoc-Diep / Michaud, Alexis / Castelli, Eric (2014): "Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a light acoustic model of the target language and testing heavyweight models from five national languages", In SLTU-2014, 153.