In this paper, we propose using speech modulation features for robust nonnative accent detection. The modulation spectrum carries long-term temporal information about speech and may discriminate between the accents of native and nonnative speakers. For each speech segment to be tested, we extract a 10-dimensional feature vector from the modulation spectrum and use it for model training and testing. The proposed modulation features are compared with other popular features, such as pitch and formants, on a nonnative French accent detection task. Results show that the modulation features yield good detection performance and are quite robust to channel distortions. In addition, when the test scores of the modulation and pitch features are combined, the error rate is significantly reduced further. The best equal error rate, 13.1%, is obtained by fusing the pitch-based and modulation-based systems.
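The equal error rate (EER) and score fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the weighted sum of z-normalised scores below is an assumption, and the function names `equal_error_rate` and `fuse_scores` are hypothetical, chosen only for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER: the point where miss rate equals false-alarm rate."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    # Every observed score is tried as a decision threshold.
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss: a target trial scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False alarm: a non-target trial scored at or above the threshold.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Assumed fusion rule: weighted sum of z-normalised scores from two systems.

    Both inputs must list the same trials in the same order and must not be constant.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return weight * za + (1.0 - weight) * zb
```

In use, the scores of the two systems would be fused over the full trial list first, then split by label (nonnative versus native) before calling `equal_error_rate`. The fusion weight would normally be tuned on held-out data; the default of 0.5 is only a placeholder.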
Bibliographic reference. Sam, Sethserey / Xiao, Xiong / Besacier, Laurent / Castelli, Eric / Li, Haizhou / Chng, Eng Siong (2011): "Speech modulation features for robust nonnative speech accent detection", In INTERSPEECH-2011, 2417-2420.