INTERSPEECH 2013
14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Lexical Stress Detection for L2 English Speech Using Deep Belief Networks

Kun Li, Xiaojun Qian, Shiyin Kang, Helen Meng

Chinese University of Hong Kong, China

This paper investigates lexical stress detection for L2 English speech using Deep Belief Networks (DBNs). The input features of the DBN comprise syllable-based prosodic features (assumed to follow a Gaussian distribution) and each syllable's expected lexical stress (assumed to follow a Bernoulli distribution). Because stressed syllables are more prominent than their neighbors, the two preceding and two following syllables are also taken into consideration. Experimental results show that the DBN achieves an accuracy of about 80% in syllable stress classification (primary/secondary/no stress) for words with three or more syllables. It outperforms the conventional Gaussian Mixture Model and our previous Prominence Model by about 8% and 4% absolute accuracy, respectively.
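The context window described in the abstract (a syllable plus its two preceding and two following neighbors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `context_window`, the choice of duration, pitch and intensity as the prosodic features, the binary encoding of expected stress, the zero-padding at word boundaries, and the numeric values themselves.

```python
from typing import List, Sequence


def context_window(
    syllables: Sequence[Sequence[float]],
    index: int,
    left: int = 2,
    right: int = 2,
) -> List[float]:
    """Concatenate the feature vectors of syllables index-left .. index+right.

    Positions that fall outside the word are filled with zeros
    (an assumed padding scheme; the abstract does not specify one).
    """
    if not syllables:
        raise ValueError("syllables must not be empty")
    dim = len(syllables[0])
    window: List[float] = []
    for pos in range(index - left, index + right + 1):
        if 0 <= pos < len(syllables):
            window.extend(syllables[pos])
        else:
            window.extend([0.0] * dim)
    return window


# Hypothetical per-syllable vectors for a three-syllable word:
# [duration, pitch, intensity, expected_stress]
# The first three stand in for real-valued prosodic features; the last
# is a binary flag standing in for the expected lexical stress.
word = [
    [0.8, 1.1, 0.9, 0.0],
    [1.4, 1.6, 1.3, 1.0],
    [0.7, 0.9, 0.8, 0.0],
]

# One network input vector per syllable, each covering five positions.
inputs = [context_window(word, i) for i in range(len(word))]
```

In this sketch each syllable yields a vector of 5 x 4 = 20 values, with the target syllable's own features in the middle block. How the paper actually normalizes the prosodic features or encodes primary versus secondary expected stress is not stated in the abstract.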


Bibliographic reference.  Li, Kun / Qian, Xiaojun / Kang, Shiyin / Meng, Helen (2013): "Lexical stress detection for L2 English speech using deep belief networks", In INTERSPEECH-2013, 1811-1815.