Odyssey 2010: The Speaker and Language Recognition Workshop

Brno, Czech Republic
28 June – 1 July 2010

Discriminative Phonotactics for Dialect Recognition Using Context-Dependent Phone Classifiers

Fadi Biadsy (1), Hagen Soltau, Lidia Mangu, Jiri Navratil (2), Julia Hirschberg (1)

(1) Columbia University, (2) IBM T. J. Watson Research Center

In this paper, we introduce a new approach to dialect recognition that relies on context-dependent (CD) phonetic differences between dialects as well as phonotactics. Given a speech utterance, we obtain the phone sequence using a CD-phone recognizer. We then identify the most likely dialect of each of these CD-phones using SVM classifiers. Augmenting the phones with the output of these classifiers, we extract augmented phonotactic features, which are subsequently given to a logistic regression classifier to obtain a dialect detection score. We test our approach on the task of detecting four Arabic dialects from 30-second utterances. We compare our performance to two baselines, PRLM and GMM-UBM, as well as to our own improved version of GMM-UBM, which employs fMLLR adaptation. Our approach performs significantly better than all three baselines, by 5% absolute Equal Error Rate (EER). The overall EER of our system is 6%.
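As a rough illustration of the augmentation step described in the abstract, the sketch below tags each recognized phone with a per-phone dialect label and then counts n-grams over the tagged sequence. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `phone/label` token format, the placeholder labels `D1` and `D2`, and the relative-frequency normalization are all hypothetical, and the labels are supplied by hand where the paper would use SVM classifier outputs.

```python
from collections import Counter


def augment_phones(phones, dialect_labels):
    """Attach a per-phone dialect label to each phone.

    The labels stand in for per-phone classifier decisions (assumed here,
    not computed); the "phone/label" token format is a hypothetical choice.
    """
    if len(phones) != len(dialect_labels):
        raise ValueError("phones and dialect_labels must have the same length")
    return [f"{p}/{d}" for p, d in zip(phones, dialect_labels)]


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def phonotactic_features(phones, dialect_labels, n=2):
    """Return relative frequencies of n-grams over the augmented phones.

    Relative-frequency normalization is an assumption made for this sketch.
    """
    augmented = augment_phones(phones, dialect_labels)
    counts = ngram_counts(augmented, n)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


if __name__ == "__main__":
    # Toy utterance: four phones with hand-assigned placeholder labels.
    feats = phonotactic_features(["b", "a", "b", "a"], ["D1", "D2", "D1", "D2"])
    for gram, value in sorted(feats.items()):
        print(gram, round(value, 3))
```

In a full system, a feature dictionary like this would be vectorized and passed to a classifier such as logistic regression to produce a detection score for each dialect.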


Bibliographic reference.  Biadsy, Fadi / Soltau, Hagen / Mangu, Lidia / Navratil, Jiri / Hirschberg, Julia (2010): "Discriminative Phonotactics for Dialect Recognition Using Context-Dependent Phone Classifiers", In Odyssey-2010, paper 044.