This paper presents SLAM: a simple method for the automatic Stylization and LAbelling of speech Melody. Its main contributions over existing methods are: the alphabet of melodic contours is fully data-driven, an explicit time-frequency representation is used to derive complex melodic contours, and melodic contours can be determined over arbitrary prosodic/syntactic units. Additionally, the system can handle some characteristics of spontaneous speech (e.g., multiple speakers, speech turns, and speech overlaps). A preliminary experiment conducted on 3 hours of spoken French indicates that a small number of contours is sufficient to account for most of the observed contours. The method can be easily adapted to other stressed languages. The implementation is open-source and freely available.
Cite as: Obin, N., Beliao, J., Veaux, C., Lacheret, A. (2014) SLAM: Automatic Stylization and Labelling of Speech Melody. Proc. Speech Prosody 2014, 246-250, doi: 10.21437/SpeechProsody.2014-37
@inproceedings{obin14_speechprosody,
  author={Nicolas Obin and Julie Beliao and Christophe Veaux and Anne Lacheret},
  title={{SLAM: Automatic Stylization and Labelling of Speech Melody}},
  year=2014,
  booktitle={Proc. Speech Prosody 2014},
  pages={246--250},
  doi={10.21437/SpeechProsody.2014-37}
}