ISCA Archive ICSLP 2000

Multi-resolution front-end for noise robust speech recognition

Ramalingam Hariharan, Imre Kiss, Olli Viikki, Jilei Tian

This paper proposes a new feature extraction approach for noise-robust speech recognition. Recent work on multi-band and missing-feature-theory-based Automatic Speech Recognition (ASR) has shown that sub-band processing of speech has certain advantages over the conventional full-band technique. In multi-band ASR, the frequency sub-bands are usually decoded independently, and the final recognition result is obtained by combining the frequency channels at some temporal level. Since it is not straightforward to determine the optimal combination level, we propose instead collecting the sub-band parameters into a single feature vector for decoding. As the full-band parameters still carry important information for classification, we suggest that full-band features also be included in the final feature vector. Our third observation is that using a PCA transform to de-correlate log-spectral features provides better recognition performance than the DCT. The experimental results show that the proposed front-end provides a 36.2% improvement in performance over the conventional full-band technique.
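
The idea of collecting full-band and sub-band parameters into one feature vector can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of sub-bands, and the number of coefficients kept per band are all assumptions made for illustration. It also uses a plain DCT-II as the de-correlating transform for simplicity, whereas the abstract reports that a PCA transform performs better; a PCA variant would replace the fixed cosine basis with one estimated from training data.

```python
import math


def dct2(values, n_coeffs):
    """Unnormalized DCT-II of `values`, truncated to the first `n_coeffs` terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def multi_resolution_features(log_energies, n_subbands=2, full_coeffs=4, sub_coeffs=2):
    """Concatenate full-band and per-sub-band coefficients into one vector.

    `log_energies` is one frame of log filter-bank energies (hypothetical input).
    The band split and coefficient counts are illustrative assumptions.
    """
    n = len(log_energies)
    if n % n_subbands != 0:
        raise ValueError("number of channels must divide evenly into sub-bands")
    width = n // n_subbands

    # Full-band resolution: transform the whole log spectrum.
    features = dct2(log_energies, full_coeffs)

    # Sub-band resolution: transform each contiguous block of channels separately.
    for b in range(n_subbands):
        band = log_energies[b * width:(b + 1) * width]
        features.extend(dct2(band, sub_coeffs))

    return features


frame = [1.0] * 8  # a flat toy spectrum, not real data
vector = multi_resolution_features(frame)
```

With these assumed settings, each frame yields a vector of `full_coeffs + n_subbands * sub_coeffs` values that is passed to a single decoder, so no separate per-band recognition results have to be combined.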


Cite as: Hariharan, R., Kiss, I., Viikki, O., Tian, J. (2000) Multi-resolution front-end for noise robust speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 550-553

@inproceedings{hariharan00b_icslp,
  author={Ramalingam Hariharan and Imre Kiss and Olli Viikki and Jilei Tian},
  title={{Multi-resolution front-end for noise robust speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 550-553}
}