In native accent identification (AID), the acoustic differences between accents are so subtle that a brute-force frame-based Gaussian mixture model (GMM) fails to capture them. As an alternative to the frame-based framework, this paper proposes a vector-based speaker modeling method to which common support vector machine (SVM) kernels can be applied. The vector-based speaker model is the concatenation of the average acoustic representations of all phonemes. SVM and GMM classifiers are compared on these speaker models. Moreover, because accents differ in only a limited number of phonemes, a variable selection framework is indispensable for selecting the accent-relevant features. We investigate a forward selection method, analysis of variance (ANOVA), and a backward selection method, SVM recursive feature elimination (SVM-RFE). We find that multiclass SVM-RFE performs comparably to ANOVA on optimally selected variable sets, while it obtains excellent performance with very few features. The results demonstrate the effectiveness of the proposed speaker models combined with the SVM classifier in both low and high dimensions, as well as the necessity of variable selection.
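To make the two central ideas concrete, the following is a minimal sketch, not the authors' implementation. It builds a speaker vector by concatenating per-phoneme mean feature vectors, then ranks the resulting variables by a one-way ANOVA F statistic. The function names (`speaker_vector`, `anova_f_scores`, `select_top`), the input layout, and the use of a plain F-score ranking are illustrative assumptions; the abstract does not specify the acoustic features, the phoneme inventory, or the exact selection procedure.

```python
from collections import defaultdict


def speaker_vector(frames, phonemes):
    """Concatenate the per-phoneme mean feature vectors of one speaker.

    frames:   list of (phoneme_label, feature_list) pairs (assumed layout).
    phonemes: fixed phoneme inventory that defines the ordering.
    """
    sums = {}
    counts = defaultdict(int)
    for label, feats in frames:
        if label not in sums:
            sums[label] = [0.0] * len(feats)
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
        counts[label] += 1
    vector = []
    for p in phonemes:
        # Assumption: every phoneme occurs at least once for this speaker.
        vector.extend(s / counts[p] for s in sums[p])
    return vector


def anova_f_scores(X, y):
    """One-way ANOVA F statistic for each column of X, grouped by labels y."""
    n, dim = len(X), len(X[0])
    classes = sorted(set(y))
    k = len(classes)
    scores = []
    for j in range(dim):
        col = [row[j] for row in X]
        grand = sum(col) / n
        between = within = 0.0
        for c in classes:
            vals = [v for v, lab in zip(col, y) if lab == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - grand) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        ms_between = between / (k - 1)
        ms_within = within / (n - k)
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores


def select_top(scores, m):
    """Indices of the m highest-scoring variables, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:m]
```

A classifier such as an SVM would then be trained on the selected columns only. SVM-RFE works in the opposite direction: it starts from all variables and repeatedly discards those with the smallest weights in a trained SVM.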
Bibliographic reference. Wu, Tingyao / Karsmakers, Peter / Van hamme, Hugo / Van Compernolle, Dirk (2008): "Comparison of variable selection methods and classifiers for native accent identification", in INTERSPEECH-2008, 305-308.