ISCA Archive Interspeech 2008

The value of auditory offset adaptation and appropriate acoustic modeling

Huan Wang, David Gelbart, Hans-Günter Hirsch, Werner Hemmert

A critical step in encoding sound for neuronal processing occurs when the analog pressure wave is coded into discrete nerve-action potentials. Recent pool models of the inner hair cell synapse do not reproduce the dead time period after an intense stimulus, so we used visual inspection and automatic speech recognition (ASR) to investigate an offset adaptation (OA) model proposed by Zhang et al. [1].

OA improved phase locking in the auditory nerve (AN) and raised ASR accuracy for features derived from AN fibers (ANFs). We also found that OA is crucial for auditory processing by onset neurons (ONs) in the next neuronal stage, the auditory brainstem. Multi-layer perceptrons (MLPs) performed much better than standard Gaussian mixture models (GMMs) for both our ANF-based and ON-based auditory features. Similar results were previously obtained with MSG (Modulation-filtered SpectroGram) auditory features [2]. Thus we believe researchers working with novel features should consider trying MLPs.

[1] X. Zhang and L. H. Carney, "Analysis of models for the synapse between the inner hair cell and the auditory nerve," J. Acoust. Soc. Am., vol. 118, pp. 1540-53, 2005.

[2] S. Sharma, D. Ellis, S. Kajarekar, P. Jain, and H. Hermansky, "Feature extraction using non-linear transformation for robust speech recognition on the Aurora database," in ICASSP, 2000.


doi: 10.21437/Interspeech.2008-210

Cite as: Wang, H., Gelbart, D., Hirsch, H.-G., Hemmert, W. (2008) The value of auditory offset adaptation and appropriate acoustic modeling. Proc. Interspeech 2008, 902-905, doi: 10.21437/Interspeech.2008-210

@inproceedings{wang08_interspeech,
  author={Huan Wang and David Gelbart and Hans-Günter Hirsch and Werner Hemmert},
  title={{The value of auditory offset adaptation and appropriate acoustic modeling}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={902--905},
  doi={10.21437/Interspeech.2008-210}
}