ISCA Archive Interspeech 2005

On integrating insights from human speech perception into automatic speech recognition

Sorin Dusan, Larry R. Rabiner

In spite of the effort and progress made during the last few decades, the performance of automatic speech recognition (ASR) systems still lags far behind that achieved by humans. Some researchers think that more speech data will be sufficient to bridge this performance gap. Others think that radical modifications to the current methods are needed, and that inspiration for these modifications should come from human speech perception (HSP). This paper focuses on two issues: first, it presents a comparison between HSP and ASR, emphasizing some insights from HSP that could still be applied in ASR; second, it presents some ideas for extracting useful non-linguistic information from the speech signal, the so-called ‘rich transcription’, which could help in selecting specialized acoustic-linguistic models that offer higher accuracy than the general models.


doi: 10.21437/Interspeech.2005-475

Cite as: Dusan, S., Rabiner, L.R. (2005) On integrating insights from human speech perception into automatic speech recognition. Proc. Interspeech 2005, 1233-1236, doi: 10.21437/Interspeech.2005-475

@inproceedings{dusan05_interspeech,
  author={Sorin Dusan and Larry R. Rabiner},
  title={{On integrating insights from human speech perception into automatic speech recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1233--1236},
  doi={10.21437/Interspeech.2005-475}
}