Real-world applications often require tracking multiple moving speakers to improve human-robot interaction, sound source separation, or both. This paper presents multiple moving speaker tracking using an 8-channel microphone array installed on a mobile robot. The problem is difficult because neither the sound sources nor the microphone array can be assumed to be stationary. Our solution rests on two key ideas: time delay of arrival estimation and multiple Kalman filters. The former localizes multiple sound sources in real time based on beamforming. The latter tracks non-linear movements using a set of Kalman filters with different history lengths, which reduces errors when tracking multiple moving speakers in noisy and reverberant environments. For quantitative evaluation of the tracking, reference motions of the sound sources and of the mobile robot, called SIG2, were measured accurately with ultrasonic 3D tag sensors. The results show that the system tracked three simultaneous sound sources even while SIG2 moved in a room with strong reverberation caused by glass walls.
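As a rough illustration of the time-delay idea only, the sketch below estimates the delay between two microphone signals by brute-force cross-correlation and converts it to a far-field arrival angle. It is not the paper's method, which uses beamforming over eight channels. The function names, the two-microphone setup, the 16 kHz sample rate, the 0.2 m microphone spacing, and the test pulse are all assumptions made for the example.

```python
import math

def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which sig best matches ref.

    A positive result means sig is a delayed copy of ref.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate ref[n] with sig[n + lag] over the overlapping samples.
        score = 0.0
        for n in range(len(ref)):
            m = n + lag
            if 0 <= m < len(sig):
                score += ref[n] * sig[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay_to_angle(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Convert a sample delay into a far-field arrival angle in degrees."""
    tdoa = lag / sample_rate
    # Clamp to [-1, 1] so rounding noise cannot push asin out of its domain.
    s = max(-1.0, min(1.0, tdoa * speed_of_sound / mic_distance))
    return math.degrees(math.asin(s))

# Hypothetical test signal: a short pulse, and the same pulse 3 samples later.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
ref = pulse + [0.0] * 10
sig = [0.0] * 3 + pulse + [0.0] * 7

lag = estimate_delay(ref, sig, max_lag=5)
angle = delay_to_angle(lag, sample_rate=16000, mic_distance=0.2)
print(lag, round(angle, 1))
```

With more than two microphones, each pair yields its own delay, and combining those delays constrains the source direction further. A real system would also need to cope with noise, reverberation, and several simultaneous sources, none of which this sketch addresses.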
Cite as: Murase, M., Yamamoto, S., Valin, J.-M., Nakadai, K., Yamada, K., Komatani, K., Ogata, T., Okuno, H.G. (2005) Multiple moving speaker tracking by microphone array on mobile robot. Proc. Interspeech 2005, 249-252, doi: 10.21437/Interspeech.2005-120
@inproceedings{murase05_interspeech, author={Masamitsu Murase and Shunichi Yamamoto and Jean-Marc Valin and Kazuhiro Nakadai and Kentaro Yamada and Kazunori Komatani and Tetsuya Ogata and Hiroshi G. Okuno}, title={{Multiple moving speaker tracking by microphone array on mobile robot}}, year=2005, booktitle={Proc. Interspeech 2005}, pages={249--252}, doi={10.21437/Interspeech.2005-120} }