We observed that human listeners distinguish one dialect from another by paying special attention to particular phonetic and/or phonotactic patterns. Motivated by this observation, we propose a technique that emulates this process. We explore a target-aware lattice rescoring (TALR) process that revises the n-gram statistics in a lattice using target dialect information. We then derive n-gram statistics from the rescored lattice as phonotactic features and build a system under the vector space modeling framework. Experimental results show that the proposed technique consistently improves dialect recognition performance on 30-second test utterances. With 3-gram statistics, we achieved equal error rates (EERs) of 4.57% and 13.28% for Chinese and English dialect recognition, respectively, on the 30-second closed test sets of the 2007 NIST Language Recognition Evaluation.
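As a rough illustration of how n-gram statistics can serve as phonotactic features in a vector space, the sketch below computes posterior-weighted (expected) n-gram counts and normalizes them into a feature vector. It is a minimal sketch under stated assumptions, not the method of the paper: the lattice is approximated by a short list of alternative phone sequences with posterior weights, the names `expected_ngram_counts` and `to_feature_vector` are hypothetical, and the target-aware rescoring step itself is not shown because the abstract does not specify it.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Assumption: a lattice is approximated by weighted alternative paths,
# each given as (phone sequence, posterior probability).
Path = Tuple[List[str], float]
NGram = Tuple[str, ...]


def expected_ngram_counts(paths: Sequence[Path], n: int) -> Dict[NGram, float]:
    """Sum each n-gram's occurrences, weighted by the posterior of its path."""
    counts: Counter = Counter()
    for phones, posterior in paths:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += posterior
    return dict(counts)


def to_feature_vector(counts: Dict[NGram, float],
                      vocabulary: Sequence[NGram]) -> List[float]:
    """Map expected counts to relative frequencies over a fixed n-gram vocabulary."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts.get(gram, 0.0) / total for gram in vocabulary]


# Hypothetical toy input: two competing phone hypotheses for one utterance.
paths = [(["a", "b", "a", "b"], 0.75), (["a", "b", "b", "b"], 0.25)]
bigram_counts = expected_ngram_counts(paths, n=2)
vocabulary = [("a", "b"), ("b", "a"), ("b", "b")]
features = to_feature_vector(bigram_counts, vocabulary)
```

In a vector space modeling setup, a vector such as `features` would be computed per utterance and passed to a classifier; the choice of classifier is left open here because the abstract does not name one.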
Bibliographic reference. Tong, Rong / Ma, Bin / Li, Haizhou / Chng, Eng Siong (2011): "Target-aware lattice rescoring for dialect recognition", In INTERSPEECH-2011, 733-736.