ISCA Archive Interspeech 2008

Multi-modal recording, analysis and indexing of poster sessions

Tatsuya Kawahara, Hisao Setoguchi, Katsuya Takanashi, Kentaro Ishizuka, Shoko Araki

A new project on multi-modal analysis of poster sessions is introduced. We have designed an environment dedicated to recording poster conversations with multiple sensors and collected a number of sessions, which are annotated with a variety of multi-modal information, including utterance units for individual speakers, backchannels, nodding, gazing, and pointing. Automatic speaker diarization, which combines speech activity detection and speaker identification, is conducted using a set of distant microphones, and reasonable performance is obtained. We then investigate automatic classification of conversation segments into two modes: presentation mode and question-answer mode. Preliminary experiments show that multi-modal features of nonverbal behaviors play a significant role in indexing this kind of conversation.

doi: 10.21437/Interspeech.2008-451

Cite as: Kawahara, T., Setoguchi, H., Takanashi, K., Ishizuka, K., Araki, S. (2008) Multi-modal recording, analysis and indexing of poster sessions. Proc. Interspeech 2008, 1622-1625, doi: 10.21437/Interspeech.2008-451

  author={Tatsuya Kawahara and Hisao Setoguchi and Katsuya Takanashi and Kentaro Ishizuka and Shoko Araki},
  title={{Multi-modal recording, analysis and indexing of poster sessions}},
  booktitle={Proc. Interspeech 2008},