We present a system for the classification of intonation patterns in human-robot interaction. The system distinguishes questions from other types of utterances and can deal with reverberation, background noise, and music interfering with the speech signal. The main building blocks of our system are multi-channel source separation, robust fundamental frequency extraction and tracking, segmentation of the speech signal, and classification of the fundamental frequency pattern of the last speech segment. We evaluate the system on Japanese sentences which are ambiguous without intonation information in a realistic human-robot interaction scenario. Despite the challenging task, our system is able to classify the intonation pattern with good accuracy. In several experiments we evaluate the contribution of the different aspects of our system.
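The final stages of the pipeline (segmentation and classification of the last segment's F0 contour) could be sketched as follows. This is a minimal illustrative assumption, not the authors' implementation: all function names, the least-squares slope criterion, and the rise threshold are hypothetical, and the F0 tracks are synthetic stand-ins for the output of the extraction stage.

```python
# Hypothetical sketch: segment a fundamental-frequency (F0) track into
# voiced stretches and label the utterance a question if the final
# segment's F0 contour rises. Thresholds and names are assumptions.

def segment_voiced(f0_track, min_len=3):
    """Split an F0 track (Hz per frame, 0 = unvoiced) into voiced segments."""
    segments, current = [], []
    for f0 in f0_track:
        if f0 > 0:
            current.append(f0)
        elif current:
            if len(current) >= min_len:
                segments.append(current)
            current = []
    if len(current) >= min_len:
        segments.append(current)
    return segments

def f0_slope(segment):
    """Least-squares slope of F0 (Hz per frame) over the segment."""
    n = len(segment)
    mean_x = (n - 1) / 2.0
    mean_y = sum(segment) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(segment))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def classify_intonation(f0_track, rise_threshold=1.0):
    """Label an utterance 'question' if the last voiced segment rises."""
    segments = segment_voiced(f0_track)
    if not segments:
        return "unknown"
    return "question" if f0_slope(segments[-1]) > rise_threshold else "statement"

# Synthetic F0 tracks (Hz per frame); 0 marks unvoiced frames.
rising = [120, 122, 121, 0, 0, 118, 125, 135, 150]    # final rise
falling = [140, 138, 137, 0, 0, 130, 124, 118, 110]   # final fall
print(classify_intonation(rising))    # question
print(classify_intonation(falling))   # statement
```

A real system would operate on F0 values produced by a robust tracker after source separation, and would need to handle tracking errors and octave jumps that this toy version ignores.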
Bibliographic reference: Heckmann, Martin / Nakadai, Kazuhiro / Nakajima, Hirofumi (2011): "Robust intonation pattern classification in human robot interaction", in Proc. INTERSPEECH 2011, pp. 3137-3140.