Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Combining Multi-Source Far Distance Speech Recognition Strategies: Beamforming, Blind Channel and Confusion Network Combination

Matthias Wölfel, John McDonough

Universität Karlsruhe, Germany

Interest within the automatic speech recognition (ASR) research community has recently focused on the recognition of speech captured with a microphone located in the medium field, rather than with one mounted on a headset and positioned next to the speaker's mouth. The capacity to recognize such speech is a primary requirement for making ASR a viable modality for so-called ubiquitous computing. This is a natural application for multiple microphones, whose signals can be combined in different ways. On the signal side, combination can be accomplished by beamforming with a microphone array or by blind source separation. On the word hypothesis side, combination can be achieved through confusion network combination. In this work, we compare the effectiveness of these combination techniques with one another and with the performance achieved using a close-talking microphone.
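As an illustration of signal-side combination, the sketch below shows delay-and-sum beamforming, the simplest form of beamforming. It is not taken from the paper, which does not state which beamformer was used. The function name `delay_and_sum`, the integer sample delays, and the toy signals are assumptions made for this example; a practical system would estimate the delays from the array geometry or from the signals themselves.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its sample delay, then average.

    channels: list of equal-length lists of samples, one list per microphone.
    delays:   non-negative integer delays in samples, one per channel;
              delays[i] is how much later the source reaches microphone i
              than it reaches the earliest microphone (assumed known here).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same length")

    # Only samples available in every channel after alignment are kept.
    out_len = n - max(delays)
    output = []
    for t in range(out_len):
        total = 0.0
        for ch, d in zip(channels, delays):
            total += ch[t + d]  # advance each channel by its own delay
        output.append(total / len(channels))
    return output


# Toy example: the same source reaches the second microphone 2 samples later.
source = [0, 1, 0, -1, 0, 1, 0, -1]
mic0 = source
mic1 = [0, 0] + source[:-2]
aligned = delay_and_sum([mic0, mic1], [0, 2])
```

Once the channels are aligned, the source adds coherently while uncorrelated noise at the individual microphones tends to average out, which is the motivation for combining several distant microphones on the signal side.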

Bibliographic reference. Wölfel, Matthias / McDonough, John (2005): "Combining multi-source far distance speech recognition strategies: beamforming, blind channel and confusion network combination", in INTERSPEECH-2005, 3149-3152.