In this paper, we introduce a new approach to dialect recognition that relies on context-dependent (CD) phonetic differences between dialects as well as on phonotactics. Given a speech utterance, we obtain its phone sequence using a CD-phone recognizer. We then identify the most likely dialect of each CD-phone using SVM classifiers. Augmenting the phones with the output of these classifiers, we extract augmented phonotactic features, which are subsequently given to a logistic regression classifier to obtain a dialect detection score. We test our approach on the task of detecting four Arabic dialects from 30-second utterances. We compare our performance to two baselines, PRLM and GMM-UBM, as well as to our own improved version of GMM-UBM, which employs fMLLR adaptation. Our approach performs significantly better than all three baselines, by a margin of 5% absolute in Equal Error Rate (EER). The overall EER of our system is 6%.
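To make the feature-extraction step concrete, the following is a minimal sketch of how phones might be augmented with per-phone dialect decisions and turned into phonotactic n-gram features. It is an illustration under stated assumptions, not the system described here: the `phone|label` token format, the placeholder labels `D1` and `D2`, the bigram limit, and the per-order frequency normalization are all hypothetical choices, and the per-phone labels are supplied by hand in place of SVM classifier output.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def augment_phones(phones: Sequence[str], labels: Sequence[str]) -> List[str]:
    """Attach each phone's predicted dialect label, e.g. 'a' + 'D1' -> 'a|D1'."""
    if len(phones) != len(labels):
        raise ValueError("phones and labels must have the same length")
    return [f"{p}|{d}" for p, d in zip(phones, labels)]


def ngram_counts(tokens: Sequence[str], max_n: int = 2) -> Counter:
    """Count all n-grams of order 1..max_n over the token sequence."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def normalize(counts: Counter) -> Dict[Tuple[str, ...], float]:
    """Convert raw counts into relative frequencies within each n-gram order."""
    totals: Counter = Counter()
    for gram, c in counts.items():
        totals[len(gram)] += c
    return {gram: c / totals[len(gram)] for gram, c in counts.items()}


# Toy input: a decoded phone sequence and hypothetical per-phone dialect
# decisions (stand-ins for classifier output; D1/D2 are placeholder labels).
phones = ["s", "a", "l", "a", "m"]
labels = ["D1", "D1", "D2", "D1", "D1"]

tokens = augment_phones(phones, labels)
features = normalize(ngram_counts(tokens, max_n=2))
```

The resulting `features` dictionary is a sparse vector of the kind that could be passed to a logistic regression classifier; in this sketch the same phone with different dialect labels yields distinct tokens, so the n-grams carry both phonotactic and phonetic-realization information.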
Cite as: Biadsy, F., Soltau, H., Mangu, L., Navratil, J., Hirschberg, J. (2010) Discriminative Phonotactics for Dialect Recognition Using Context-Dependent Phone Classifiers. Proc. The Speaker and Language Recognition Workshop (Odyssey 2010), paper 44
@inproceedings{biadsy10_odyssey,
  author={Fadi Biadsy and Hagen Soltau and Lidia Mangu and Jiri Navratil and Julia Hirschberg},
  title={{Discriminative Phonotactics for Dialect Recognition Using Context-Dependent Phone Classifiers}},
  year=2010,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2010)},
  pages={paper 44}
}