8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Speech Coding and Information Processing by Auditory Neurons

Huan Wang, Werner Hemmert

Infineon Technologies, Germany

One fundamental difference between information processing in the auditory pathway and in automatic speech recognition (ASR) systems lies in the coding and processing of nerve action potentials. Spike trains code amplitude information by means of a rate code, but most information is carried by precise spike timing. In this paper we focus on neurons located in the ventral cochlear nucleus (VCN), which receive direct input from primary auditory nerve fibers (ANFs). We generate spike trains of the ANFs and VCN neurons with our inner ear model and calculate the transmitted information using a vowel as the input stimulus. For ANFs, transmitted information is highest in the frequency range of 200-500 Hz and decreases towards higher frequencies because the temporal precision of the spikes degrades. A single stellate neuron is able to transmit a large portion (up to 66%) of the information transmitted by five of its innervating ANFs. Because of the slow membrane time constant of stellate neurons, their information rate decreases even faster with characteristic frequency (CF) than that of ANFs. The spectral information of sound signals is well reflected in the rate-place code of ANFs and VCN neurons; however, the major part of the information (about 90%) is carried by spike timing. We conclude that this fine-grained temporal information should not be neglected in automatic speech recognition.
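
The abstract does not say which estimator was used to calculate the transmitted information. As a rough illustration only, the sketch below assumes a "direct method" style estimate: spike trains recorded over repeated presentations of the same stimulus are cut into binary words, and the information per word is the entropy of all words (total entropy) minus the average entropy across trials at a fixed time (noise entropy). The function names, the non-overlapping word scheme, and the binning are assumptions made for this example, not the authors' implementation, and the estimate carries no correction for finite-sampling bias.

```python
from collections import Counter
from math import log2


def entropy_bits(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def split_words(train, word_len):
    """Cut a binary spike train into non-overlapping words of `word_len` bins."""
    n_words = len(train) // word_len
    return [tuple(train[i * word_len:(i + 1) * word_len]) for i in range(n_words)]


def transmitted_information(trials, word_len):
    """Information (bits per word) = total entropy - noise entropy.

    `trials` is a list of binary spike trains (one 0/1 entry per time bin),
    each a response to the same stimulus (hypothetical input format).
    """
    words = [split_words(t, word_len) for t in trials]
    n_words = min(len(w) for w in words)
    # Total entropy: words pooled over all time positions and all trials.
    total = entropy_bits([w[k] for w in words for k in range(n_words)])
    # Noise entropy: variability across trials at a fixed time, averaged over time.
    noise = sum(entropy_bits([w[k] for w in words]) for k in range(n_words)) / n_words
    return total - noise


# Perfectly repeatable timing: every trial is identical, so noise entropy is zero.
precise = [[1, 0, 0, 1]] * 3
# Same spike count per word, but timing varies across trials and not over time.
jittered = [[1, 0, 1, 0], [0, 1, 0, 1]]

info_precise = transmitted_information(precise, word_len=2)
info_jittered = transmitted_information(jittered, word_len=2)
```

In this toy case both responses have one spike per word, so a pure rate code cannot tell them apart; only the response with reproducible spike timing carries information about the position in the stimulus. Dividing the result by the word duration would give an information rate in bits per second.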

Bibliographic reference.  Wang, Huan / Hemmert, Werner (2007): "Speech coding and information processing by auditory neurons", In INTERSPEECH-2007, 426-429.