12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Speech Modulation Features for Robust Nonnative Speech Accent Detection

Sethserey Sam (1), Xiong Xiao (2), Laurent Besacier (1), Eric Castelli (3), Haizhou Li (4), Eng Siong Chng (2)

(1) LIG (UMR 5217), France
(2) Nanyang Technological University, Singapore
(3) MICA, Vietnam
(4) A*STAR, Singapore

In this paper, we propose using speech modulation features for robust nonnative accent detection. The modulation spectrum carries long-term temporal information of speech and may help discriminate between the accents of native and nonnative speakers. For each speech segment to be tested, we extract a 10-dimensional feature vector from the modulation spectrum and use it for model training and testing. The proposed modulation features are compared with other popular features, such as pitch and formants, on a nonnative French accent detection task. Results show that the modulation features produce good detection performance and are quite robust to channel distortions. In addition, when the test scores of the modulation features and the pitch features are combined, the error rate is further significantly reduced. The best equal error rate, 13.1%, is obtained by fusing the pitch-based and modulation-based systems.
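
The abstract does not say how the 10 dimensions are derived from the modulation spectrum, so the following is only a minimal sketch of one plausible recipe, not the authors' method. It assumes a log-energy envelope computed over 25 ms frames with a 10 ms shift, a Fourier transform of that envelope, and 10 equal-width bands up to 20 Hz. The function name `modulation_features` and all parameter values are hypothetical.

```python
import numpy as np


def modulation_features(signal, sample_rate, frame_len=0.025, frame_shift=0.010,
                        n_bands=10, max_mod_hz=20.0):
    """Return n_bands log modulation-band energies for one speech segment.

    Assumed recipe (not taken from the paper): short-time log-energy
    envelope -> power spectrum of the envelope -> equal-width band sums.
    """
    signal = np.asarray(signal, dtype=float)
    win = int(round(frame_len * sample_rate))
    hop = int(round(frame_shift * sample_rate))
    n_frames = 1 + (len(signal) - win) // hop
    if n_frames < 2:
        raise ValueError("segment too short for a modulation spectrum")

    # Slice the signal into overlapping, Hamming-windowed frames.
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(win)

    # Log-energy envelope, sampled once per frame shift (100 Hz by default).
    envelope = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    envelope -= envelope.mean()  # remove the constant (0 Hz) component

    # Modulation spectrum: power spectrum of the envelope trajectory.
    power = np.abs(np.fft.rfft(envelope * np.hanning(n_frames))) ** 2
    mod_freqs = np.fft.rfftfreq(n_frames, d=hop / sample_rate)

    # Sum the power inside equal-width modulation-frequency bands.
    edges = np.linspace(0.0, max_mod_hz, n_bands + 1)
    feats = np.empty(n_bands)
    for b in range(n_bands):
        in_band = (mod_freqs >= edges[b]) & (mod_freqs < edges[b + 1])
        feats[b] = np.log(power[in_band].sum() + 1e-10)
    return feats


# Usage: a 1 kHz tone whose amplitude is modulated at 5 Hz (synthetic input).
sr = 16000
t = np.arange(3 * sr) / sr
demo = (1.0 + 0.5 * np.cos(2 * np.pi * 5.0 * t)) * np.sin(2 * np.pi * 1000.0 * t)
demo_feats = modulation_features(demo, sr)
```

With these assumed settings each band is 2 Hz wide, so slow amplitude fluctuations such as syllable-rate rhythm end up in the lowest few dimensions. A vector of this kind would then be passed to whatever classifier is used for model training and testing.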


Bibliographic reference.  Sam, Sethserey / Xiao, Xiong / Besacier, Laurent / Castelli, Eric / Li, Haizhou / Chng, Eng Siong (2011): "Speech modulation features for robust nonnative speech accent detection", In INTERSPEECH-2011, 2417-2420.