Third Workshop on Spoken Language Technologies for Under-resourced Languages

Cape Town, South Africa
May 7-9, 2012

Automatic Speech Recognition System for Under-Resourced Languages Based on Speeral: Application to Berber Language

Z. Benkhellat (1), E. Ferreira (2), Pascal Nocera (2), M. Guerti (1)

(1) Informatic Department, University of Bejaia, Bejaia, Algeria
(2) Laboratoire Informatique d’Avignon, LIA, Avignon, France

The ability to collect and process a large amount of resources (vocabularies, text corpora, transcribed speech corpora, phonetic dictionaries) is a critical prerequisite for systems based on statistical methods. This requirement becomes a serious obstacle for languages that lack computer resources, known as under-resourced languages, such as African languages. Our work aims to find an efficient methodology for improving speech recognition systems for such languages. This article presents a possible solution for the Berber language and describes the set of tools used in our study. Specifically, we address the insufficient amount of resources by taking into account the linguistic specificities of the Berber language and by using innovative methods to build the ASR resources (acoustic model, lexicon, and language model).

Index Terms: speech recognition, Berber language, Speeral, under-resourced language

