The 1st Workshop on Child, Computer and Interaction (WOCCI2008)

Chania, Crete, Greece
October 23, 2008

Does Affect Affect Automatic Recognition of Children’s Speech?

Björn Schuller (1), Anton Batliner (2), Stefan Steidl (2), Dino Seppi (3)

(1) Institute for Human-Machine Communication, Technische Universität München, Munich, Germany
(2) Lehrstuhl für Mustererkennung, Friedrich-Alexander-Universität Erlangen, Germany
(3) Fondazione Bruno Kessler, irst, Trento, Italy

The automatic recognition of children's speech is well known to be a challenge, and so is the influence of affect, which is believed to degrade the performance of a speech recogniser. In this contribution, we investigate the combination of these phenomena: extensive test runs are carried out for 1k-vocabulary continuous speech recognition on spontaneous angry, motherese and emphatic children's speech as opposed to neutral speech. The experiments mainly address two questions: how specific emotions influence word accuracy, and whether neutral speech material is sufficient for training as opposed to acoustic model adaptation under matched conditions. In the results, emphatic and angry speech are recognised best, while neutral speech proves a good choice for training. To discuss this effect, we further visualise the emotion distribution in the MFCC space by Sammon transformation.

Bibliographic reference. Schuller, Björn / Batliner, Anton / Steidl, Stefan / Seppi, Dino (2008): "Does affect affect automatic recognition of children's speech?", In WOCCI-2008, paper 14.