This paper presents a simple approach to phonotactic dialect recognition which uses lattices of time-synchronous phone co-occurrences at the frame level. In previous work, we successfully applied cross-decoder phone co-occurrences to improve performance in language recognition experiments on the 2007 NIST LRE database. We define a phone co-occurrence as the simultaneous (time-synchronous) presence of two phone units coming from two different phone decoders. In this work, the approach is ported to a dialect recognition task, based on the assumption that co-occurrences can better represent the subtle differences among dialects. In addition, a slightly different approach is presented, based on the simultaneous presence of two phone units in the lattice produced by a single decoder (intra-decoder phone co-occurrences). To evaluate the approach, a selection of open software (Brno University of Technology phone decoders, HTK, SRILM, LIBLINEAR and FoCal) was used, and experiments were carried out on the Arabic dialects of the 2011 NIST LRE database. The proposed cross-decoder approach outperformed the baseline phonotactic systems, yielding around 7% relative improvement. The fusion of the baseline system with the proposed approach yielded 7.31% EER and CLLR = 0.497, which amounts to a 19% relative improvement.
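As a rough illustration of the cross-decoder co-occurrence idea described above, the following minimal Python sketch counts time-synchronous phone pairs at the frame level. It is a simplification under stated assumptions, not the authors' implementation: it takes one phone label per frame from each decoder (a 1-best labelling) rather than lattices, and the function name `cross_decoder_cooccurrences` and the toy label sequences are hypothetical.

```python
from collections import Counter


def cross_decoder_cooccurrences(frames_a, frames_b):
    """Count time-synchronous phone pairs from two frame-aligned label sequences.

    Assumption: each sequence holds exactly one phone label per frame
    (1-best output), which simplifies the lattice-based setting of the paper.
    """
    if len(frames_a) != len(frames_b):
        raise ValueError("both decoders must cover the same number of frames")
    # Each frame contributes one (phone from decoder A, phone from decoder B) pair.
    return Counter(zip(frames_a, frames_b))


# Hypothetical frame-level outputs of two different phone decoders.
decoder_a = ["a", "a", "b", "b", "b", "sil"]
decoder_b = ["e", "a", "p", "p", "b", "sil"]

counts = cross_decoder_cooccurrences(decoder_a, decoder_b)
total = sum(counts.values())

# Relative frequency of each co-occurrence pair over the utterance.
relative_freqs = {pair: n / total for pair, n in counts.items()}
```

Relative frequencies of this kind could, for instance, be collected into a feature vector for a classifier; how the actual system builds its features from lattices is not specified here.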
Index Terms: Phonotactic Dialect Recognition, Phone Co-occurrences, Phone Lattices, Support Vector Machines
Bibliographic reference. Varona, Amparo / Penagarikano, Mikel / Rodriguez-Fuentes, Luis Javier / Bordel, German / Diez, Mireia (2012): "Using time-synchronous phone co-occurrences in a SVM-phonotactic dialect recognition system", In INTERSPEECH-2012, 2069-2072.