We focus in this paper on the detection of the emotions in the voice of a speaker in a Human-Robot Interaction context. This work is part of the ROMEO project, which aims to design a robot for both elderly people and children. Our system offers several modules based on a multi-level processing of the audio cues. The affective markers produced by these different modules will allow to pilot the emotional behaviour of the robot. Since the models are built with recording data and the system will test real-life data, we need to estimate our emotion detection system performances in cross-corpus situations. Cross-validation experiments on a three class detection show that derivatives and energy features may be removed from our feature set for this specific task. Cross-corpora experiments on anger-positive-neutral data suggest that detection performances may be better with two different models: one for child voices, one for adult voices.
Bibliographic reference. Tahon, Marie / Delaborde, Agnes / Devillers, Laurence (2011): "Real-life emotion detection from speech in human-robot interaction: experiments across diverse corpora with child and adult voices", In INTERSPEECH-2011, 3121-3124.