12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Automatic Prosodic Events Detection by Using Syllable-Based Acoustic, Lexical and Syntactic Features

Chong-Jia Ni, Wenju Liu, Bo Xu

Chinese Academy of Sciences, China

Automatic detection and annotation of prosodic events are important for both speech understanding and natural speech synthesis. In this paper, a complementary model method is proposed for detecting prosodic events. The method drops the assumption that the acoustic features are independent of the lexical and syntactic features, models at the model level both the features of the current syllable and the features of its context, and achieves complementarity by exploiting the strengths of each model. Experiments on the Boston University Radio News Corpus show that the complementary model achieves an accuracy of 91.40% for pitch accent detection, 95.19% for intonational phrase boundary (IPB) detection, and 93.96% for break index detection. These results for pitch accent, IPB, and break index detection are significantly better than those of previous work.


Bibliographic reference. Ni, Chong-Jia / Liu, Wenju / Xu, Bo (2011): "Automatic prosodic events detection by using syllable-based acoustic, lexical and syntactic features", in INTERSPEECH-2011, 2017-2020.