Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

A Multimodal Database of Gestures and Speech

Satoru Hayamizu (1), Shigeki Nagaya (2), Keiko Watanuki (3), Masayuki Nakazawa (3), Shuichi Nobe (4), Takashi Yoshimura (1)

(1) Electrotechnical Laboratory; (2) Central Research Laboratory, Hitachi; (3) RWC Multimodal Sharp Laboratory; (4) Aoyama Gakuin University, Japan

This paper describes a multimodal database consisting of image data of human gestures and the corresponding speech data, intended for research on multimodal interaction systems. The purpose of this database is to provide a foundation for the research and development of multimodal interactive systems. Our primary concern in selecting utterances and gestures for inclusion in the database was to ascertain the kinds of expressions and gestures that artificial systems could produce and recognize. In total, 25 kinds of gestures and speech were each repeated four times in the recording of each subject. The speech and gestures of 48 subjects were recorded and converted into files; in the first version, the files for 12 subjects were stored on CD-ROMs.


Bibliographic reference.  Hayamizu, Satoru / Nagaya, Shigeki / Watanuki, Keiko / Nakazawa, Masayuki / Nobe, Shuichi / Yoshimura, Takashi (1999): "A multimodal database of gestures and speech", In EUROSPEECH'99, 2247-2250.