ISCA Archive ICSLP 2000

A high-performance auditory feature for robust speech recognition

Qi Li, Frank K. Soong, Olivier Siohan

An auditory feature extraction algorithm for robust speech recognition in adverse acoustic environments is proposed. Based on an analysis of the human auditory system, the algorithm consists of several modules: FFT, outer-middle-ear transfer function, frequency conversion from the linear scale to the Bark scale, auditory filtering, nonlinearity, and discrete cosine transform. Three experiments were conducted on connected digit recognition in wireless and land-line communications using handsets and hands-free microphones. Compared to LPCC and MFCC features, the proposed feature showed average error-rate reductions of 11% to 23% in handset and hands-free acoustic environments.
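
The module chain named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so every concrete choice below is an assumption made for illustration. That includes Traunmueller's Bark approximation, the placeholder `ear_weight` gain curve, triangular filters equally spaced in Bark, log compression as the nonlinearity, and the defaults `sample_rate=8000`, `n_filters=16`, `n_coeffs=12`. All function names are hypothetical.

```python
import cmath
import math

def hz_to_bark(f_hz):
    # Traunmueller's Bark approximation (assumed; the paper may use another).
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def power_spectrum(frame):
    # Naive DFT power spectrum for bins 0..N/2; stands in for the FFT module.
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def bin_freqs(n_bins, sample_rate):
    # Center frequency in Hz of each spectrum bin (even frame length assumed).
    return [k * sample_rate / (2.0 * (n_bins - 1)) for k in range(n_bins)]

def ear_weight(f_hz):
    # Placeholder outer-middle-ear gain peaking near 3 kHz. Purely
    # illustrative; not the transfer function used in the paper.
    if f_hz <= 0.0:
        return 0.0
    octaves = math.log2(f_hz / 3000.0)
    return math.exp(-0.5 * (octaves / 2.0) ** 2)

def bark_filterbank(spec, sample_rate, n_filters):
    # Triangular filters with centers equally spaced on the Bark axis
    # (an assumed stand-in for the paper's auditory filters).
    barks = [hz_to_bark(f) for f in bin_freqs(len(spec), sample_rate)]
    lo, hi = barks[0], barks[-1]
    step = (hi - lo) / (n_filters + 1)
    bands = []
    for m in range(1, n_filters + 1):
        center = lo + m * step
        bands.append(sum(max(0.0, 1.0 - abs(z - center) / step) * p
                         for p, z in zip(spec, barks)))
    return bands

def dct_ii(values, n_coeffs):
    # Unnormalized DCT-II, keeping the first n_coeffs coefficients.
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]

def auditory_features(frame, sample_rate=8000, n_filters=16, n_coeffs=12):
    spec = power_spectrum(frame)
    freqs = bin_freqs(len(spec), sample_rate)
    weighted = [p * ear_weight(f) for p, f in zip(spec, freqs)]
    bands = bark_filterbank(weighted, sample_rate, n_filters)
    # Log compression as the nonlinearity (assumed); small floor avoids log(0).
    compressed = [math.log(e + 1e-10) for e in bands]
    return dct_ii(compressed, n_coeffs)

# Usage: one 64-sample frame of a 1 kHz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
features = auditory_features(frame)
```

The sketch processes a single frame and omits windowing, pre-emphasis, and any normalization, none of which the abstract mentions.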


Cite as: Li, Q., Soong, F.K., Siohan, O. (2000) A high-performance auditory feature for robust speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 51-54

@inproceedings{li00h_icslp,
  author={Qi Li and Frank K. Soong and Olivier Siohan},
  title={{A high-performance auditory feature for robust speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={51--54}
}