ISCA Archive Interspeech 2017

Automatic Labelling of Prosodic Prominence, Phrasing and Disfluencies in French Speech by Simulating the Perception of Naïve and Expert Listeners

George Christodoulides, Mathieu Avanzi, Anne Catherine Simon

We explore the use of machine learning techniques (notably SVM classifiers and Conditional Random Fields) to automate the prosodic labelling of French speech, based on modelling and simulating the perception of prosodic events by naïve and expert listeners. The models are based on previous work on the perception of syllabic prominence and hesitation-related disfluencies, and on an experiment on the real-time perception of prosodic boundaries. Expert and non-expert listeners annotated samples from three multi-genre corpora (CPROM, CPROM-PFC, LOCAS-F). Automatic prosodic annotation is approached as a sequence labelling problem, drawing on multiple information sources (acoustic features, lexical and shallow syntactic features) in accordance with the experimental findings showing that listeners integrate all such information in their perception of prosodic segmentation and events. We test combinations of features and machine learning methods, and we compare the automatic labelling with expert annotation. The result of this study is a tool that automatically annotates prosodic events by simulating the perception of expert and naïve listeners.
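
As an illustration of the sequence labelling formulation described in the abstract, the following minimal sketch builds one feature dictionary per syllable, combining acoustic and shallow syntactic information with a context window over neighbouring syllables. It is not the authors' implementation: the field names (duration_ms, f0_mean_st, pause_after_ms, pos), the window size and the feature naming scheme are all assumptions made for this example. Feature dictionaries of this shape are the kind of input commonly given to CRF toolkits.

```python
from typing import Dict, List

# Hypothetical per-syllable record. The keys below are illustrative
# assumptions, not the feature set used in the paper.
Syllable = Dict[str, object]


def syllable_features(sylls: List[Syllable], i: int, window: int = 1) -> Dict[str, float]:
    """Build a feature dict for syllable i, including its neighbours."""
    feats: Dict[str, float] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        prefix = f"{offset:+d}"  # "-1", "+0", "+1", ...
        if j < 0:
            feats[f"{prefix}:BOS"] = 1.0  # before the start of the sequence
            continue
        if j >= len(sylls):
            feats[f"{prefix}:EOS"] = 1.0  # past the end of the sequence
            continue
        s = sylls[j]
        # Acoustic features (assumed names and units)
        feats[f"{prefix}:dur"] = float(s["duration_ms"])
        feats[f"{prefix}:f0"] = float(s["f0_mean_st"])
        feats[f"{prefix}:pause"] = float(s["pause_after_ms"])
        # Shallow syntactic feature: part-of-speech tag as a one-hot indicator
        feats[f"{prefix}:pos={s['pos']}"] = 1.0
    return feats


def sequence_features(sylls: List[Syllable], window: int = 1) -> List[Dict[str, float]]:
    """Turn a syllable sequence into a sequence of feature dicts."""
    return [syllable_features(sylls, i, window) for i in range(len(sylls))]
```

Each utterance then becomes a list of feature dictionaries paired with a list of labels of the same length (for example, prominent or non-prominent per syllable), which is the usual input shape for a sequence classifier.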


doi: 10.21437/Interspeech.2017-971

Cite as: Christodoulides, G., Avanzi, M., Simon, A.C. (2017) Automatic Labelling of Prosodic Prominence, Phrasing and Disfluencies in French Speech by Simulating the Perception of Naïve and Expert Listeners. Proc. Interspeech 2017, 3936-3940, doi: 10.21437/Interspeech.2017-971

@inproceedings{christodoulides17_interspeech,
  author={George Christodoulides and Mathieu Avanzi and Anne Catherine Simon},
  title={{Automatic Labelling of Prosodic Prominence, Phrasing and Disfluencies in French Speech by Simulating the Perception of Naïve and Expert Listeners}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3936--3940},
  doi={10.21437/Interspeech.2017-971}
}