ISCA Archive SpeechProsody 2010

Automatic prominence classification in Swedish

Samer Al Moubayed, G. Ananthakrishnan, Laura Enflo

This study aims to automatically classify levels of acoustic prominence in a dataset of 200 Swedish read-speech sentences produced by one male native speaker. Each word in the sentences was categorized by four speech experts into one of three groups according to the perceived level of prominence. Six acoustic features at the syllable level and seven features at the word level were used. Two machine learning algorithms, namely Support Vector Machines (SVM) and Memory-Based Learning (MBL), were trained to classify the words into their respective classes. The MBL gave an average word-level accuracy of 69.08% and the SVM gave an average accuracy of 65.17% on the test set. These values were comparable to the average accuracy of the human annotators with respect to the average annotations. In this study, word duration was found to be the most important feature for classifying prominence in Swedish read speech.
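As a rough illustration of the memory-based approach mentioned above, the following minimal sketch treats MBL as plain k-nearest-neighbour classification over stored examples. This is an assumption, not the authors' implementation: the function names, the two-feature layout (word duration plus one placeholder acoustic feature), the value of k, and all data values are hypothetical and chosen only to make the example runnable.

```python
import math
from collections import Counter


def knn_classify(memory, query, k=3):
    """Return the majority label among the k stored examples nearest to query."""
    # Rank stored (features, label) pairs by Euclidean distance to the query.
    nearest = sorted(memory, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(memory, test_set, k=3):
    """Fraction of test words whose predicted level matches the reference label."""
    correct = sum(knn_classify(memory, x, k) == y for x, y in test_set)
    return correct / len(test_set)


# Hypothetical toy data, not from the paper:
# (word duration in seconds, placeholder acoustic feature) -> prominence level 0, 1 or 2.
memory = [
    ((0.10, 0.10), 0), ((0.12, 0.20), 0), ((0.15, 0.10), 0),
    ((0.30, 0.40), 1), ((0.32, 0.50), 1), ((0.28, 0.45), 1),
    ((0.55, 0.80), 2), ((0.60, 0.90), 2), ((0.58, 0.85), 2),
]
test_set = [((0.11, 0.15), 0), ((0.31, 0.42), 1), ((0.57, 0.90), 2)]

if __name__ == "__main__":
    print(accuracy(memory, test_set))
```

The "memory" here is simply the list of labelled training words; classification involves no training step beyond storing them, which is the defining property of memory-based methods. A real system would use the thirteen syllable- and word-level features and normalize them before computing distances.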

Index Terms: Swedish prominence, SVM, MBL, syllable and word level features, word duration

Cite as: Al Moubayed, S., Ananthakrishnan, G., Enflo, L. (2010) Automatic prominence classification in Swedish. Proc. Speech Prosody 2010, paper 2002

  author={Samer {Al Moubayed} and G. Ananthakrishnan and Laura Enflo},
  title={{Automatic prominence classification in Swedish}},
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 2002}