ISCA Archive Interspeech 2007

Speech coding and information processing by auditory neurons

Huan Wang, Werner Hemmert

One fundamental difference between information processing in the auditory pathway and in automatic speech recognition (ASR) systems lies in the coding and processing of nerve action potentials. Spike trains code amplitude information by means of a rate code, but most information is carried by precise spike timing. In this paper we focus on neurons located in the ventral cochlear nucleus (VCN), which receive direct input from primary auditory nerve fibers (ANFs). We generate spike trains of ANFs and VCN neurons with our inner ear model and calculate the transmitted information using a vowel as the input stimulus. For ANFs, transmitted information is highest in the frequency range of 200-500 Hz and decreases towards higher frequencies because the temporal precision of the spikes degrades. A single stellate neuron is able to transmit a large portion (up to 66%) of the information transmitted by five of its innervating ANFs. Because stellate neurons have a slow membrane time constant, their information rate decreases even faster with characteristic frequency (CF) than that of ANFs. The spectral information of sound signals is well reflected in the rate-place code of ANFs and VCN neurons; however, the major part of the information (about 90%) is carried by spike timing. We conclude that this fine-grained temporal information should not be neglected in automatic speech recognition.
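
The abstract does not say which estimator is used to calculate the transmitted information. As an illustration only, the sketch below assumes a "direct method" style estimate on binned spike trains: the entropy of spike words pooled over all times (total entropy) minus the entropy across repeated presentations at a fixed time (noise entropy). The function names, the binary raster layout (trials x time bins) and the word length are hypothetical choices for this example, not details taken from the paper.

```python
import numpy as np


def entropy_bits(counts):
    """Shannon entropy in bits of a histogram given as raw counts."""
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def words_from_raster(raster, word_len):
    """Encode each run of `word_len` consecutive bins as one integer word.

    raster: binary array of shape (n_trials, n_bins), 1 = spike in that bin.
    Returns an integer array of shape (n_trials, n_bins - word_len + 1).
    """
    n_trials, n_bins = raster.shape
    n_words = n_bins - word_len + 1
    words = np.zeros((n_trials, n_words), dtype=np.int64)
    for k in range(word_len):
        # bin k of every word contributes bit k of the integer code
        words += raster[:, k:k + n_words].astype(np.int64) << k
    return words


def transmitted_information(raster, word_len):
    """Information per word (bits): total entropy minus mean noise entropy."""
    words = words_from_raster(raster, word_len)
    n_symbols = 1 << word_len
    # Total entropy: word distribution pooled over all trials and times.
    h_total = entropy_bits(np.bincount(words.ravel(), minlength=n_symbols))
    # Noise entropy: variability across trials at each time, averaged over time.
    h_noise = np.mean([
        entropy_bits(np.bincount(words[:, t], minlength=n_symbols))
        for t in range(words.shape[1])
    ])
    return float(h_total - h_noise)


# Toy check: a perfectly repeatable alternating spike pattern over 20 trials.
pattern = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0])
reliable = np.tile(pattern, (20, 1))
info_reliable = transmitted_information(reliable, word_len=2)

# A silent raster carries no information under this estimate.
silent = np.zeros((20, 9), dtype=int)
info_silent = transmitted_information(silent, word_len=2)
```

Dividing the result by the word duration would give an information rate. With a finite number of trials this kind of estimate is biased, so in practice it is usually extrapolated over data size and word length; that step is omitted here.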


doi: 10.21437/Interspeech.2007-204

Cite as: Wang, H., Hemmert, W. (2007) Speech coding and information processing by auditory neurons. Proc. Interspeech 2007, 426-429, doi: 10.21437/Interspeech.2007-204

@inproceedings{wang07c_interspeech,
  author={Huan Wang and Werner Hemmert},
  title={{Speech coding and information processing by auditory neurons}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={426--429},
  doi={10.21437/Interspeech.2007-204}
}