ISCA Archive Interspeech 2008

Human speech perception and feature extraction

Bryce E. Lobdell, Mark Hasegawa-Johnson, Jont B. Allen

Speech perception experiments tell us a great deal about which factors affect human performance and behavior. In particular, many experiments indicate that the signal-to-noise ratio spectrum is an important factor; indeed, it is the basis of the Articulation Index, a standard measure of "speech channel capacity." In this paper we compare speech recognition performance for features based on the Articulation Index with two alternatives typically used in speech recognition. The experimental conditions vary the spectrum and level of the noise distorting the speech in the training and test sets. The perceptually inspired features generally perform better when there is a mismatch between the training and test noise spectrum and level, but worse when the test and training noises match.
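
To make the role of the signal-to-noise ratio spectrum concrete, the following is a minimal sketch of an Articulation-Index-style calculation. It is not the feature extraction used in the paper, which the abstract does not specify. It assumes the common simplified form in which each band's SNR in dB is mapped linearly from the range -15 dB to +15 dB onto [0, 1] and the band values are averaged with importance weights. The function names, the uniform default weights, and the example SNR values are all hypothetical.

```python
def band_audibility(snr_db):
    """Map one band's SNR (dB) onto [0, 1].

    Assumed simplification: 0 at or below -15 dB, 1 at or above +15 dB,
    linear in between.
    """
    return min(1.0, max(0.0, (snr_db + 15.0) / 30.0))


def articulation_index_sketch(snr_db_per_band, weights=None):
    """Weighted average of per-band audibility values.

    `weights` stands in for band-importance weights; uniform weights are
    an assumption made here for illustration only.
    """
    n = len(snr_db_per_band)
    if n == 0:
        raise ValueError("need at least one band")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and SNR values must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * band_audibility(s) for w, s in zip(weights, snr_db_per_band))
    return weighted / total


# Hypothetical SNR spectrum over four bands, in dB.
example_snr_db = [-20.0, 0.0, 15.0, 30.0]
example_index = articulation_index_sketch(example_snr_db)
print(example_index)
```

Because each band saturates at 0 and 1, a band that is already well above or well below the noise contributes nothing further when the noise level changes, which is one intuition for why features built on the SNR spectrum might behave differently from conventional features under mismatched noise.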

doi: 10.21437/Interspeech.2008-494

Cite as: Lobdell, B.E., Hasegawa-Johnson, M., Allen, J.B. (2008) Human speech perception and feature extraction. Proc. Interspeech 2008, 1797-1800, doi: 10.21437/Interspeech.2008-494

  author={Bryce E. Lobdell and Mark Hasegawa-Johnson and Jont B. Allen},
  title={{Human speech perception and feature extraction}},
  booktitle={Proc. Interspeech 2008},