This paper demonstrates that a low-level, linear description of the response properties of auditory neurons can exhibit some of the high-level properties of the categorical perception of human speech. In particular, it shows that the non-linearities observed in human perception of speech sounds that span a categorical boundary can arise naturally from a low-level statistical description of phonemic contrasts in the time-frequency plane, understood here as the receptive field of auditory neurons. The TIMIT database was used to train a model auditory neuron that discriminates between /s/ and /sh/, and a computer simulation shows that the neuron responds categorically to a linear continuum of synthetic fricative sounds spanning the /s/-/sh/ boundary. The model's response provides a good fit to human labeling behavior and, in addition, accounts for asymmetries in reaction time across the two categories.
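As a rough illustration of how a linear filter can yield a categorical labeling curve, the sketch below fits a two-class linear discriminant to synthetic "spectra" and probes it with a linear continuum between the two class prototypes. Everything here is an assumption made for illustration rather than the paper's method: the Gaussian-bump prototypes stand in for TIMIT fricative data, the names `prototype`, `sample`, `linear_response`, and `p_s` are hypothetical, and the logistic readout is simply the posterior implied by two Gaussian classes with a shared covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for fricative spectra: 32 frequency channels,
# with energy centered higher in frequency for "/s/" than for "/sh/".
n_channels = 32
freqs = np.arange(n_channels)

def prototype(center, width=4.0):
    """Gaussian bump over the frequency channels (illustrative only)."""
    return np.exp(-0.5 * ((freqs - center) / width) ** 2)

proto_s = prototype(center=24)
proto_sh = prototype(center=12)

def sample(proto, n=500, noise=1.0):
    """Noisy tokens scattered around a class prototype."""
    return proto + noise * rng.standard_normal((n, n_channels))

X_s = sample(proto_s)
X_sh = sample(proto_sh)

# Linear discriminant weights w = Sigma^{-1} (mu_s - mu_sh), with a bias
# that places the boundary midway between the class means (equal priors).
mu_s, mu_sh = X_s.mean(axis=0), X_sh.mean(axis=0)
pooled = 0.5 * (np.cov(X_s, rowvar=False) + np.cov(X_sh, rowvar=False))
w = np.linalg.solve(pooled, mu_s - mu_sh)
b = -0.5 * w @ (mu_s + mu_sh)

def linear_response(x):
    """Linear filter output: the 'receptive field' w applied to x."""
    return x @ w + b

def p_s(x):
    """Assumed logistic readout: posterior probability of '/s/'."""
    return 1.0 / (1.0 + np.exp(-linear_response(x)))

# Linear continuum from the /sh/ prototype (a=0) to the /s/ prototype (a=1).
steps = np.linspace(0.0, 1.0, 11)
continuum = np.array([(1 - a) * proto_sh + a * proto_s for a in steps])

drive = linear_response(continuum)   # changes linearly along the continuum
labels = p_s(continuum)              # changes sigmoidally along the continuum

for a, d, p in zip(steps, drive, labels):
    print(f"step {a:.1f}  drive {d:+7.2f}  P(/s/) {p:.3f}")
```

Under these assumptions the filter output changes by equal increments along the continuum, while the labeling probability changes little near the endpoints and steeply near the boundary. The sketch does not model reaction times.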
Cite as: Neufeld, C. (2017) Modeling Categorical Perception with the Receptive Fields of Auditory Neurons. Proc. Interspeech 2017, 1173-1177, doi: 10.21437/Interspeech.2017-1611
@inproceedings{neufeld17_interspeech,
  author={Chris Neufeld},
  title={{Modeling Categorical Perception with the Receptive Fields of Auditory Neurons}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1173--1177},
  doi={10.21437/Interspeech.2017-1611}
}