ISCA Archive Interspeech 2008

Within-class feature normalization for robust speech recognition

Yuan-Fu Liao, Chi-Hui Hsu, Chi-Min Yang, Jeng-Shien Lin, Sen-Chia Chang

This paper proposes a within-class feature normalization (WCFN) framework for robust speech recognition that operates in a transformed segment-level (rather than frame-level) super-vector space. In this framework, each segment hypothesis in a lattice is represented by a high-dimensional super-vector and projected onto a class-dependent, lower-dimensional eigen-subspace to remove unwanted variability due to environmental noise and speaker differences (for example, different SNR values, genders, and noise types). The normalized super-vectors are then verified by a bank of class detectors, and the detector outputs are used to rescore the lattice. Experimental results on the Aurora 2 multi-condition training task showed that the proposed WCFN approach achieved an average word error rate (WER) of 7.45%. WCFN not only outperformed the multi-condition training baseline (Multi-Con., 13.72%) but also the blind ETSI advanced DSR front-end (ETSI-Adv., 8.65%), the histogram equalization (HEQ, 8.66%) and the non-blind reference model weighting (RMW, 7.29%) approaches.
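
The projection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how super-vectors are built or how the eigen-subspace is estimated, so the sketch assumes that super-vectors are already available as fixed-length NumPy arrays and that each class's subspace is spanned by the top-k principal directions of that class's training super-vectors. The function names `fit_class_subspace` and `project_supervector` and the parameter `k` are hypothetical.

```python
import numpy as np


def fit_class_subspace(supervectors, k):
    """Estimate a class-dependent eigen-subspace (assumption: plain PCA).

    supervectors: array of shape (n_segments, dim), all from one class.
    k: number of eigen-directions to keep (hypothetical parameter).
    Returns the class mean (dim,) and an orthonormal basis (dim, k).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are the eigenvectors of the class covariance matrix,
    # ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T


def project_supervector(supervector, mean, basis):
    """Map one high-dimensional super-vector to k subspace coordinates."""
    return (supervector - mean) @ basis
```

Under these assumptions, each segment hypothesis would be projected once per class, and the resulting low-dimensional vectors would be passed to that class's detector; the detector itself is outside the scope of this sketch.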

doi: 10.21437/Interspeech.2008-296

Cite as: Liao, Y.-F., Hsu, C.-H., Yang, C.-M., Lin, J.-S., Chang, S.-C. (2008) Within-class feature normalization for robust speech recognition. Proc. Interspeech 2008, 1020-1023, doi: 10.21437/Interspeech.2008-296

  author={Yuan-Fu Liao and Chi-Hui Hsu and Chi-Min Yang and Jeng-Shien Lin and Sen-Chia Chang},
  title={{Within-class feature normalization for robust speech recognition}},
  booktitle={Proc. Interspeech 2008},