Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Data Collection in Real Acoustical Environments for Sound Scene Understanding and Hands-Free Speech Recognition

Satoshi Nakamura (1), Kazuo Hiyane (2), Futoshi Asano (3), Takeshi Yamada (1), Takashi Endo (4)

(1) Nara Institute of Science and Technology; (2) Mitsubishi Research Institute, Chiyoda, Tokyo; (3) Electrotechnical Laboratory, Tsukuba, Ibaraki; (4) Real World Computing Partnership, Tsukuba, Ibaraki, Japan

This paper describes a sound scene database needed for studies such as sound source localization, sound retrieval, sound recognition and hands-free speech recognition in real acoustical environments, and reports on a project to collect the sound scene data, supported by the Real World Computing Partnership (RWCP). There are many kinds of sound scenes in real environments. A sound scene is characterized by its sound sources and room acoustics, and the number of combinations of sound sources, source positions and rooms in real acoustical environments is huge. Two approaches are taken to build the sound scene database in the early stage of the project. The first approach is to collect isolated sound sources covering many kinds of non-speech and speech sounds. The second approach is to collect impulse responses in various acoustical environments. The sound in a given environment can then be simulated by convolving the isolated sound sources with the impulse responses. In a later stage, sound scene data in real acoustical environments will be collected using a three-dimensional microphone array. In this paper, the plan and progress of our sound scene database project are described.
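The simulation step mentioned in the abstract, convolving an isolated (dry) source with a measured room impulse response, can be illustrated with a minimal sketch. The function name `convolve` and the toy sample values below are assumptions made for illustration only; they are not taken from the paper or from the RWCP database, and a practical implementation would more likely use FFT-based convolution on real recordings.

```python
def convolve(source, impulse_response):
    """Direct-form linear convolution of two finite sample sequences.

    y[n] = sum_k source[n - k] * impulse_response[k]
    """
    if not source or not impulse_response:
        return []
    out = [0.0] * (len(source) + len(impulse_response) - 1)
    for n, s in enumerate(source):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


# Hypothetical toy data (not from the database):
dry = [1.0, 0.5, 0.0, -0.5]   # isolated sound source samples
rir = [1.0, 0.0, 0.25]        # impulse response: direct path plus one echo

simulated = convolve(dry, rir)
print(simulated)  # [1.0, 0.5, 0.25, -0.375, 0.0, -0.125]
```

In this toy case the output is the dry signal plus a copy delayed by two samples and scaled by 0.25, which is what a single reflection in a room would contribute.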


Bibliographic reference. Nakamura, Satoshi / Hiyane, Kazuo / Asano, Futoshi / Yamada, Takeshi / Endo, Takashi (1999): "Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition", in EUROSPEECH'99, 2255-2258.