International Workshop on Hands-Free Speech Communication (HSC2001), April 9-11, 2001
For hands-free speech recognition, it is desirable to acquire a speech signal of the highest possible quality and to reduce the mismatch between the test utterance and the acoustic model. In this paper, we present a stochastic approach that integrates acoustic model adaptation and signal enhancement using a microphone array. With this method, it is possible to find speaker directions even at low SNRs. The enhanced speech is recognized using composite HMMs, which can represent the statistics of overlapping speech. When the SNR of the target speaker's speech relative to the interfering speech was 0 dB, the composite-speech HMMs improved the recognition rate to 80.4%. Integrating composite HMMs with a microphone array further improved it to 94.2%, a substantial improvement over the original 23.0% recognition rate obtained with clean-speech HMMs and a single microphone.
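To make the 0 dB condition concrete, the following is a minimal sketch of how an interfering signal can be scaled so that the target-to-interferer power ratio matches a chosen SNR before the two are summed. It is an illustration only, not the procedure used in the paper: the function names `mix_at_snr` and `measure_snr_db` are hypothetical, and Gaussian noise stands in for real speech recordings.

```python
import math
import random


def power(x):
    """Mean power (mean of squared samples) of a signal."""
    return sum(v * v for v in x) / len(x)


def measure_snr_db(target, interferer):
    """Target-to-interferer power ratio in decibels."""
    return 10 * math.log10(power(target) / power(interferer))


def mix_at_snr(target, interferer, snr_db):
    """Scale the interferer so the target-to-interferer ratio equals snr_db,
    then add it to the target. Returns (mixture, scaled interferer)."""
    # Required interferer power is P_target / 10^(snr_db / 10).
    gain = math.sqrt(power(target) / (power(interferer) * 10 ** (snr_db / 10)))
    scaled = [gain * v for v in interferer]
    mixed = [t + s for t, s in zip(target, scaled)]
    return mixed, scaled


# Placeholder signals (assumption): Gaussian noise instead of real speech.
rng = random.Random(0)
target = [rng.gauss(0, 1.0) for _ in range(16000)]
interferer = [rng.gauss(0, 0.3) for _ in range(16000)]

# 0 dB means both speakers have equal power in the mixture.
mixed, scaled = mix_at_snr(target, interferer, 0.0)
print(round(measure_snr_db(target, scaled), 6))
```

At 0 dB the interfering speaker is as loud as the target, which is consistent with the low 23.0% recognition rate reported for clean-speech HMMs with a single microphone.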
Bibliographic reference. Takiguchi, T. / Nishimura, M. (2001): "Integration of HMM composition and a microphone array for overlapping speech recognition", In HSC2001, 127-130.