Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Optimization of Sub-Band Weights Using Simulated Noisy Speech in Multi-Band Speech Recognition

Yik-Cheung Tam, Brian Mak

Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong

Recently, multi-band speech recognition has been proposed to improve robustness to environmental noise. One important issue is how to combine the decisions of the individual sub-band recognizers to arrive at a final decision. Under the hidden Markov modeling (HMM) framework, one common approach is to combine the sub-band likelihoods linearly in an optimal manner, so that the more reliable sub-bands are emphasized and the corrupted sub-bands are de-emphasized. In our experience, estimating the weights from clean speech is not effective, as the resulting weights are not optimal in noisy environments. In this paper, we derive the optimal weights from simulated noisy speech using a discriminative training method with minimum classification error (MCE) or maximum mutual information (MMI) as the cost function. The methods are evaluated on recognition of isolated TI digits. Compared with full-band recognition in noise at an SNR of 0 dB, multi-band recognition with MCE-derived weights reduces word errors by 45.9% on a tone noise and by an average of 17.9% on three real noises. MCE-derived and MMI-derived weights perform similarly, and both are much better than weights derived by other means.
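The weighted combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the combination is applied to per-band log-likelihoods, it uses a common sigmoid-smoothed MCE loss with only the best competing class, and the function names (`combine_scores`, `recognize`, `mce_loss`) and all numbers are hypothetical.

```python
import math

def combine_scores(subband_loglik, weights):
    """Weighted linear combination of per-band log-likelihoods.

    subband_loglik[k][c] is the (assumed) log-likelihood of class c
    reported by the recognizer for sub-band k.
    """
    num_classes = len(subband_loglik[0])
    return [
        sum(w * band[c] for w, band in zip(weights, subband_loglik))
        for c in range(num_classes)
    ]

def recognize(subband_loglik, weights):
    """Return the index of the class with the highest combined score."""
    scores = combine_scores(subband_loglik, weights)
    return max(range(len(scores)), key=scores.__getitem__)

def mce_loss(subband_loglik, weights, true_class, gamma=1.0):
    """Sigmoid-smoothed misclassification measure (assumed form).

    d > 0 means the best rival outscores the true class, so the loss
    approaches 1; d < 0 means a correct decision, so it approaches 0.
    """
    scores = combine_scores(subband_loglik, weights)
    best_rival = max(s for c, s in enumerate(scores) if c != true_class)
    d = best_rival - scores[true_class]
    return 1.0 / (1.0 + math.exp(-gamma * d))

# Toy data (hypothetical): 3 sub-bands, 2 classes, true class is 0.
# Sub-band 0 is corrupted by noise and strongly favors the wrong class.
loglik = [
    [-80.0, -40.0],  # corrupted band
    [-30.0, -45.0],  # reliable band
    [-35.0, -42.0],  # reliable band
]
equal_weights = [1 / 3, 1 / 3, 1 / 3]
tuned_weights = [0.1, 0.45, 0.45]  # corrupted band de-emphasized

print(recognize(loglik, equal_weights))  # corrupted band dominates
print(recognize(loglik, tuned_weights))  # reliable bands dominate
```

In this toy case, equal weights let the corrupted band flip the decision, while down-weighting it restores the correct class and lowers the smoothed MCE loss. Discriminative training would adjust the weights to reduce such a loss over noisy training utterances rather than fixing them by hand as done here.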

Bibliographic reference. Tam, Yik-Cheung / Mak, Brian (2000): "Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition", in ICSLP-2000, vol. 1, 313-316.