Odyssey 2008: The Speaker and Language Recognition Workshop
Stellenbosch, South Africa
We investigate whether phone recognition models trained on large English databases can be used for speaker recognition in another language. Such cross-language use of recognition models is an attractive option when a speaker recognition system is to be ported to a new language that lacks the necessary data resources, while retaining some of the advantages of phone modeling and ASR-based feature extraction. We compare the performance of such systems to a baseline cepstral GMM system (which is inherently language independent), and to a phone-recognition-based system trained exclusively on Arabic data. Our results indicate that cross-language models are highly competitive and, at least in our case, have a performance advantage over both within-language training and the language-independent baseline. We also examine the effect of coverage of colloquial Arabic dialects in the training data.
Bibliographic reference. Stolcke, Andreas / Kajarekar, Sachin (2008): "Recognizing Arabic speakers with English phones", In Odyssey-2008, paper 024.