Large vocabulary continuous speech recognition is particularly difficult for low-resource languages. In the scenario we focus on here, there is a very limited amount of acoustic training data in the target language, but more plentiful data in other languages. In our approach, we investigate methods based on the Automatic Speech Attribute Transcription (ASAT) framework, and train universal classifiers on multiple languages to learn articulatory features. A hierarchical architecture is applied at both the articulatory-feature and phone levels, to make the neural network more discriminative. Finally, we train multilayer perceptrons on multiple streams from different languages and obtain MLPs for this low-resource application. In our experiments, we obtain a significant improvement of about 12% relative over a conventional baseline in this low-resource scenario.
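The hierarchical idea described in the abstract can be sketched as two cascaded MLPs: a first network maps acoustic features to articulatory-feature posteriors, and a second network takes those posteriors (stacked with the acoustics) and produces phone posteriors. The sketch below is a minimal NumPy illustration; all dimensions, the single-hidden-layer shape, and the random weights are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp_forward(x, W1, b1, W2, b2):
    # One sigmoid hidden layer, softmax output (posteriors).
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))
    return softmax(h @ W2 + b2)

# Toy dimensions (assumed): 39-dim acoustics, 10 articulatory
# feature classes, 40 phones, 64 hidden units per MLP.
n_acoustic, n_hidden, n_af, n_phone = 39, 64, 10, 40

# First level: acoustic features -> articulatory-feature posteriors.
W1a = rng.normal(scale=0.1, size=(n_acoustic, n_hidden)); b1a = np.zeros(n_hidden)
W2a = rng.normal(scale=0.1, size=(n_hidden, n_af));       b2a = np.zeros(n_af)

# Second level: AF posteriors stacked with acoustics -> phone posteriors.
W1p = rng.normal(scale=0.1, size=(n_af + n_acoustic, n_hidden)); b1p = np.zeros(n_hidden)
W2p = rng.normal(scale=0.1, size=(n_hidden, n_phone));           b2p = np.zeros(n_phone)

x = rng.normal(size=(5, n_acoustic))               # 5 frames of acoustic features
af_post = mlp_forward(x, W1a, b1a, W2a, b2a)       # articulatory-feature posteriors
phone_post = mlp_forward(np.hstack([af_post, x]),  # hierarchical second stage
                         W1p, b1p, W2p, b2p)
```

In a trained system the two stages would be learned from the multilingual data rather than randomly initialized; the sketch only shows how the articulatory-feature stage feeds the phone-level stage.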
Index Terms: low-resource language; multilayer perceptrons; articulatory features; hierarchical architectures
Bibliographic reference. Qian, Yanmin / Liu, Jia (2012): "Articulatory feature based multilingual MLPs for low-resource speech recognition", In INTERSPEECH-2012, 2602-2605.