ISCA Archive SAPA 2004

Multiple-microphone robust speech recognition using decoder-based channel selection

Yasunari Obuchi

In this paper, we focus on speech recognition using multiple microphones of varying quality. One channel may be of much higher quality than the others, and may even be better than the output of standard microphone array techniques such as the delay-and-sum beamformer. Therefore, it is important to find a good indicator for selecting the channel to use for recognition. This paper introduces Decoder-Based Channel Selection (DBCS), which provides a criterion for evaluating the quality of each channel by comparing the speech recognition hypotheses obtained from compensated and uncompensated feature vectors. We evaluate the performance of DBCS using speech data recorded with a PDA-like mockup. DBCS with Delta-Cepstrum Normalization for single-channel compensation provides a significant improvement over the delay-and-sum beamformer. In addition, the concept of DBCS is extended to the delay-and-sum beamformer outputs of various subsets of microphones. This extension gives some additional improvement in speech recognition accuracy.
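
The selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two hypotheses are compared or how the comparison is turned into a decision, so the following rests on assumptions: the disagreement measure is a word-level edit distance, and the channel whose compensated and uncompensated hypotheses disagree least is selected. The names `word_edit_distance` and `select_channel` are hypothetical, and the hypotheses are assumed to come from a recognizer that is outside the scope of the sketch.

```python
from typing import Dict, List, Sequence, Tuple


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two hypotheses."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_channel(
    hypotheses: Dict[str, Tuple[List[str], List[str]]]
) -> str:
    """Pick the channel whose two hypotheses disagree least.

    `hypotheses` maps a channel id to a pair
    (uncompensated hypothesis, compensated hypothesis), each a word list.
    ASSUMPTION: a small disagreement indicates a good channel; the
    abstract does not specify the actual criterion.
    """
    return min(
        hypotheses,
        key=lambda ch: word_edit_distance(*hypotheses[ch]),
    )


# Illustrative (made-up) hypotheses for two channels.
example = {
    "mic0": ("call john smith".split(), "call joan smith now".split()),
    "mic1": ("call john smith".split(), "call john smith".split()),
}
selected = select_channel(example)
```

Under the same assumptions, the extension mentioned in the abstract would correspond to adding delay-and-sum outputs of microphone subsets as extra entries in the dictionary, so that they compete with the single channels on equal terms.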


Cite as: Obuchi, Y. (2004) Multiple-microphone robust speech recognition using decoder-based channel selection. Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004), paper 52

@inproceedings{obuchi04_sapa,
  author={Yasunari Obuchi},
  title={{Multiple-microphone robust speech recognition using decoder-based channel selection}},
  year=2004,
  booktitle={Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004)},
  pages={paper 52}
}