12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Toward a Continuous Modeling of French Prosodic Structure: Using Acoustic Features to Predict Prominence Location and Prominence Degree

Mathieu Avanzi (1), Nicolas Obin (2), Anne Lacheret-Dujour (1), Bernard Victorri (3)

(1) MoDyCo, France
(2) IRCAM, France
(3) LaTTiCe, France

This paper presents a tool that generates French rhythmic structure semi-automatically, without taking grammatical cues into account. Starting from a phonemic alignment, the software first locates prominent syllables using basic acoustic features such as F0, duration and silent pause. It then assigns a degree of prominence to each syllable identified. This degree is computed from the values of silent pause, relative duration and height averages used for prominence detection in the first step. The second part of the article presents an experiment conducted to validate the algorithm's performance by comparing the software's predictions with a continuous manual coding carried out by four annotators on a 4-minute stretch of corpus (788 syllables) covering read-aloud speech, map task and spontaneous dialogue. The results are encouraging: in the best cases, a Fleiss' kappa calculation gives an agreement of 0.8, and a correlation agreement calculation gives 91%.
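For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of the standard Fleiss' kappa formula. It is not the authors' evaluation code. It assumes that every syllable receives one categorical label from each of a fixed number of raters. The function name `fleiss_kappa` and the toy labels (0 = non-prominent, 1 = weakly prominent, 2 = strongly prominent) are hypothetical and chosen only for illustration; the abstract does not state which label set the annotators used.

```python
from collections import Counter
from typing import Hashable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa for items that are each labeled by the same number of raters.

    ratings[i][r] is the label that rater r gave to item i.
    """
    if not ratings:
        raise ValueError("need at least one rated item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    n_items = len(ratings)
    category_totals: Counter = Counter()
    observed = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Share of rater pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        observed += agreeing_pairs / (n_raters * (n_raters - 1))
    observed /= n_items

    if len(category_totals) < 2:
        raise ValueError("kappa is undefined when only one label is ever used")

    # Agreement expected by chance from the overall label proportions.
    total = n_items * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): five syllables, four raters, labels 0/1/2.
toy_ratings = [
    [2, 2, 2, 2],
    [0, 0, 0, 0],
    [1, 1, 2, 1],
    [0, 0, 1, 0],
    [2, 2, 2, 1],
]
toy_kappa = fleiss_kappa(toy_ratings)
print(f"Fleiss' kappa on toy data: {toy_kappa:.2f}")
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the 0.8 reported for the best cases indicates strong agreement.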


Bibliographic reference. Avanzi, Mathieu / Obin, Nicolas / Lacheret-Dujour, Anne / Victorri, Bernard (2011): "Toward a continuous modeling of French prosodic structure: using acoustic features to predict prominence location and prominence degree", in INTERSPEECH-2011, 2033-2036.