ISCA Archive ICSLP 1994

Environmental robustness in automatic speech recognition using physiologically-motivated signal processing

Yoshiaki Ohshima, Richard M. Stern

This paper examines methods by which speech recognition systems can be made more environmentally robust by analyzing the performance of Seneff's model of the auditory periphery [7]. The purpose of the paper is threefold. First, we document the extent to which the Seneff model reduces the degradation in speech recognition accuracy caused by additive noise and/or linear filtering. Second, we examine the extent to which individual components of the nonlinear neural transduction (NT) stage of the Seneff model contribute to recognition accuracy by evaluating the recognition accuracy with individual components of the model removed from the processing. Third, we determine the extent to which the robustness provided by the Seneff model is complementary to and independent of the improvement in recognition accuracy already provided by existing successful acoustical pre-processing algorithms such as codeword-dependent cepstral normalization (CDCN) [1]. Experimental techniques are proposed in the course of investigating the above issues. The results of speech recognition experiments using CMU's SPHINX [4] system under real and simulated degradation are reported.


Cite as: Ohshima, Y., Stern, R.M. (1994) Environmental robustness in automatic speech recognition using physiologically-motivated signal processing. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1347-1350

@inproceedings{ohshima94_icslp,
  author={Yoshiaki Ohshima and Richard M. Stern},
  title={{Environmental robustness in automatic speech recognition using physiologically-motivated signal processing}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1347--1350}
}