International Workshop on Hands-Free Speech Communication (HSC2001)

April 9-11, 2001
Kyoto, Japan

Robust Speech Recognition by Multiple Beamforming with Reflection Signal Equalization

Takanobu Nishiura (1,2), Satoshi Nakamura (1), and Kiyohiro Shikano (2)

(1) ATR Spoken Language Translation Research Laboratories, Kyoto, Japan
(2) Graduate School of Information Science, Nara Institute of Science and Technology, Japan

In real environments, room reverberation seriously degrades the quality of sound capture. To solve this problem, J.L. Flanagan et al. proposed multiple beamforming [1], which forms directivity not only in the direction of the desired sound source but also in the directions of its reflection images. However, this method is difficult to apply in real environments, because the distortion that wall impedances impose on the reflected sound signals must first be equalized. To overcome this problem, we propose a new multiple beamforming algorithm that equalizes the amplitude spectrum and phase spectrum of the reflection signals by a cross-spectrum method [2]. This paper focuses on the ASR (Automatic Speech Recognition) performance of the proposed multiple beamformer. Evaluation experiments are conducted in real environments. In the ASR evaluation, we confirm that the WRR (Word Recognition Rate) of the multiple beamformer with equalization improves over that of a single beamformer by 4.7% at a distance of 2 meters and by 6.0% at a distance of 3 meters between the sound source and the microphone array.


Bibliographic reference.  Nishiura, Takanobu / Nakamura, Satoshi / Shikano, Kiyohiro (2001): "Robust speech recognition by multiple beamforming with reflection signal equalization", In HSC2001, 119-122.