8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Complex Spectrum Circle Centroid for Microphone-Array-Based Noisy Speech Recognition

Shigeki Sagayama, Takashi Okajima, Yutaka Kamamoto, Takuya Nishimoto

The University of Tokyo, Japan

We propose a novel principle, the Complex Spectrum Circle Centroid (CSCC), for restoring the complex spectrum of the target (speech) signal from multiple microphone input signals in a noisy environment. If noise arrives at the microphones with different time delays relative to the target signal, the observed noisy signals lie on a circle in the complex spectrum plane, and the target signal is restored by finding the centroid of that circle. Unlike most existing methods such as ICA, AMNOR, and beamforming, this nonlinear method handles any type of noise, including non-stationary, moving, signal-correlated, non-planar, and spoken noise, without identifying the noise direction or training parameters. In speech recognition experiments, the proposed method achieved a word accuracy close to the clean-speech recognition rate of 89.4% in the case of a single noise source, and improved word accuracy from 0% with one microphone to 60.6% with 8 microphones in the case of three spoken noise sources.
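The geometric idea in the abstract can be illustrated with a minimal sketch for a single frequency bin. It assumes the microphone signals are already time-aligned to the target, so each observation is the target spectrum plus one noise component rotated by a microphone-dependent phase. The circle is fitted here with a standard algebraic least-squares fit, which is an assumption for illustration and not necessarily the estimator used in the paper. The function name `fit_circle` and all numeric values (frequency, delays, spectra) are hypothetical.

```python
import numpy as np

def fit_circle(points):
    """Algebraic least-squares circle fit in the complex plane.

    Returns (center, radius). Illustrative stand-in only; the paper's
    own estimation procedure is not reproduced here.
    """
    x, y = points.real, points.imag
    # A circle satisfies x^2 + y^2 = 2*a*x + 2*b*y + c, which is linear in (a, b, c),
    # where (a, b) is the center and c = r^2 - a^2 - b^2.
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x**2 + y**2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return complex(a, b), float(np.sqrt(c + a**2 + b**2))

# Hypothetical values for one frequency bin (not taken from the paper).
omega = 2 * np.pi * 1000.0                    # angular frequency, rad/s
target = 0.8 - 0.3j                           # target complex spectrum
noise = 1.5 + 0.7j                            # noise complex spectrum
delays = np.array([0.0, 0.10, 0.25, 0.40,     # noise delay relative to the
                   0.55, 0.70, 0.80, 0.95]) * 1e-3  # target at 8 microphones, s

# Each microphone observes the target plus a phase-rotated copy of the noise,
# so all observations lie on a circle of radius |noise| centered at the target.
observed = target + noise * np.exp(-1j * omega * delays)

estimate, radius = fit_circle(observed)
print("estimated target:", estimate)
print("estimated |noise|:", radius)
```

In this idealized single-noise case the fitted center coincides with the target spectrum. With several simultaneous noise sources the observations no longer lie exactly on one circle, so any such fit is only approximate.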


Bibliographic reference. Sagayama, Shigeki / Okajima, Takashi / Kamamoto, Yutaka / Nishimoto, Takuya (2004): "Complex spectrum circle centroid for microphone-array-based noisy speech recognition", In INTERSPEECH-2004, 825-828.