ISCA Archive Interspeech 2009

A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English

Ioana Vasilescu, Martine Adda-Decker, Lori Lamel, Pierre Hallé

This article compares the errors made by automatic speech recognizers with those made by humans on near-homophones in American English and French. This exploratory study focuses on the impact of limited word context and the ambiguities that may result for automatic speech recognition (ASR) systems and human listeners. Perceptual experiments using 7-gram chunks centered on words that an ASR system output either incorrectly or correctly show that humans make significantly more transcription errors on the chunks centered on incorrect words, thus highlighting the local ambiguity. The long-term aim of this study is to improve the modeling of such ambiguous items in order to reduce ASR errors.


doi: 10.21437/Interspeech.2009-53

Cite as: Vasilescu, I., Adda-Decker, M., Lamel, L., Hallé, P. (2009) A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English. Proc. Interspeech 2009, 144-147, doi: 10.21437/Interspeech.2009-53

@inproceedings{vasilescu09_interspeech,
  author={Ioana Vasilescu and Martine Adda-Decker and Lori Lamel and Pierre Hallé},
  title={{A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={144--147},
  doi={10.21437/Interspeech.2009-53}
}