InSTIL/ICALL 2004 Symposium on Computer Assisted Learning, June 17-19, 2004
Practicing the spoken language is important to language learners in critical parts of the United States military. Automatic speech recognition (ASR) is a technology that promises to provide self-paced practice opportunities to language learners. We propose to improve ASR in computer assisted language learning (CALL) applications by modeling the speech behavior of language learners. ASR systems trained on native data perform poorly when used to recognize the speech of beginning language learners. Model merging, an adaptation method guided by a phone map derived from a confusion matrix, makes our Arabic speech recognizers more tolerant of Anglophone students.
Hidden Markov Model (HMM) phone sets are trained for English and Arabic, and then English phones are merged into the Arabic phones to make a new Arabic system. A data-driven procedure is presented for automatically mapping phones between two HMM sets.
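The mapping and merging steps can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the mapping rule or the weighting scheme, so the sketch assumes that each Arabic phone is mapped to the English phone it is most often confused with, that each phone has a single Gaussian mixture rather than one per HMM state, and that a single hypothetical `source_weight` controls how much probability mass the English components receive. All names and numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: list  # per-dimension means
    var: list   # per-dimension (diagonal) variances


@dataclass
class Mixture:
    weights: list     # mixture weights, expected to sum to 1
    components: list  # Gaussian components, one per weight


def map_phones(confusion):
    """Map each Arabic phone to the English phone with the highest count.

    Assumption: confusion[arabic][english] holds how often the English
    phone was recognized where the Arabic phone was labeled.
    """
    return {ar: max(row, key=row.get) for ar, row in confusion.items()}


def merge_mixtures(target, source, source_weight):
    """Pool the components of both mixtures and rescale their weights.

    Target weights are scaled by (1 - source_weight) and source weights
    by source_weight, so the merged weights still sum to 1.
    """
    weights = [(1.0 - source_weight) * w for w in target.weights]
    weights += [source_weight * w for w in source.weights]
    return Mixture(weights, target.components + source.components)


# Toy confusion counts (made up for illustration).
confusion = {
    "b": {"b": 40, "p": 5},
    "q": {"k": 22, "g": 9},
}
phone_map = map_phones(confusion)

# Toy one-dimensional models (made up for illustration).
arabic = {
    "b": Mixture([1.0], [Gaussian([0.0], [1.0])]),
    "q": Mixture([0.5, 0.5], [Gaussian([2.0], [1.0]), Gaussian([3.0], [0.5])]),
}
english = {
    "b": Mixture([1.0], [Gaussian([0.2], [1.1])]),
    "k": Mixture([1.0], [Gaussian([2.5], [0.8])]),
}

# Build the new Arabic system by merging each mapped English mixture in.
merged = {
    ar: merge_mixtures(arabic[ar], english[en], source_weight=0.25)
    for ar, en in phone_map.items()
}
```

With these toy inputs, the Arabic phone `q` is mapped to English `k`, and its merged mixture has three components: the two original Arabic Gaussians with reduced weights plus the English Gaussian carrying the remaining quarter of the mass.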
Accuracy improvements were observed when model merging was combined with other adaptation techniques. The positive results indicate that the speech patterns of non-native speakers are carried over to the new system by the mapping of phones and their weighting.
Bibliographic reference. Morgan, John J. (2004): "Making a speech recognizer tolerate non-native speech through Gaussian mixture merging", In ICALL-2004, paper 052.