INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Human Speech Perception and Feature Extraction

Bryce E. Lobdell, Mark Hasegawa-Johnson, Jont B. Allen

University of Illinois at Urbana-Champaign, USA

Speech perception experiments tell us a great deal about which factors affect human performance and behavior. In particular, many experiments indicate that the signal-to-noise ratio spectrum is an important factor; indeed, it is the basis of the Articulation Index, a standard measure of "speech channel capacity." In this paper we compare speech recognition performance for features based on the Articulation Index with two alternatives typically used in speech recognition. The experimental conditions vary the spectrum and level of the noise distorting the speech in the training and test sets. The perceptually inspired features generally perform better when there is a mismatch between the training and test noise spectrum and level, but worse when the test and training noises match.
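
As a rough illustration of how a band-wise signal-to-noise ratio spectrum can be turned into an Articulation-Index-style score, the sketch below clips each band's SNR to a 30 dB range and averages the result. This is a simplified textbook-style formulation, not the feature extraction used in the paper: the function names, the -12 dB floor, the 30 dB range, and the equal band weights are all assumptions made for illustration.

```python
import math


def band_snr_db(signal_power, noise_power):
    """Per-band SNR in dB from per-band signal and noise powers (hypothetical helper)."""
    return [10.0 * math.log10(s / n) for s, n in zip(signal_power, noise_power)]


def articulation_index(snr_db, weights=None, floor_db=-12.0, range_db=30.0):
    """Simplified AI-style score in [0, 1] from a band SNR spectrum.

    Assumption: each band contributes in proportion to its SNR clipped to
    [floor_db, floor_db + range_db]; bands are weighted equally unless
    explicit weights (summing to 1) are supplied.
    """
    if weights is None:
        weights = [1.0 / len(snr_db)] * len(snr_db)
    # Map each band SNR to an audibility value between 0 and 1.
    audibility = [min(1.0, max(0.0, (s - floor_db) / range_db)) for s in snr_db]
    return sum(w * a for w, a in zip(weights, audibility))


if __name__ == "__main__":
    # Two hypothetical noise conditions with the same speech band powers.
    speech = [100.0, 100.0, 100.0, 100.0]
    flat_noise = [10.0, 10.0, 10.0, 10.0]
    low_freq_noise = [1000.0, 100.0, 1.0, 1.0]
    print(articulation_index(band_snr_db(speech, flat_noise)))
    print(articulation_index(band_snr_db(speech, low_freq_noise)))
```

Because the score depends on the shape of the SNR spectrum rather than on overall level alone, two noises with different spectra can give different scores for the same speech, which is the kind of training/test mismatch the experiments vary.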

Bibliographic reference.  Lobdell, Bryce E. / Hasegawa-Johnson, Mark / Allen, Jont B. (2008): "Human speech perception and feature extraction", In INTERSPEECH-2008, 1797-1800.