8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Automatic Phonetic Segmentation of Spanish Emotional Speech

A. Gallardo-Antolín (1), R. Barra (2), Marc Schröder (3), Sacha Krstulović (3), J. M. Montero (2)

(1) Universidad Carlos III de Madrid, Spain
(2) Universidad Politécnica de Madrid, Spain
(3) DFKI GmbH, Germany

Unit selection is the state-of-the-art technique for high-quality synthetic emotional speech. However, it requires a large, phonetically segmented corpus that is expensive to produce, so cost-effective automatic segmentation techniques should be studied. The HMM experiments in this paper lead to three conclusions: (1) segmentation performance can depend heavily on whether the intended emotion is segmental or prosodic in nature, with segmental emotions being more difficult to segment than prosodic ones; (2) several emotions should be combined to obtain a larger training set, especially when prosodic emotions are involved, and particularly for small training sets; and (3) combining emphatic and non-emphatic emotional recordings (short sentences vs. long paragraphs) can degrade overall performance.

Bibliographic reference. Gallardo-Antolín, A. / Barra, R. / Schröder, Marc / Krstulović, Sacha / Montero, J. M. (2007): "Automatic phonetic segmentation of Spanish emotional speech", in INTERSPEECH-2007, 2905-2908.