ISCA Archive Interspeech 2009

A comparison of audio-free speech recognition error prediction methods

Preethi Jyothi, Eric Fosler-Lussier

Predicting possible speech recognition errors can be invaluable for a number of Automatic Speech Recognition (ASR) applications. In this study, we extend a Weighted Finite State Transducer (WFST) framework for error prediction to facilitate a comparison between two approaches to predicting confusable words: examining recognition errors on the training set to learn phone confusions, and utilizing distances between the phonetic acoustic models for the prediction task. We also expand the framework to handle continuous word recognition. Without using any acoustic information related to the test data, we accurately predict 60% of the misrecognized sentences (with an average of 15 words per sentence) and a little over 70% of the total number of errors in the unseen test data.
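
As a rough illustration of the first approach (learning phone confusions from recognition errors on the training set), the sketch below estimates confusion probabilities from phone pairs that are assumed to be already aligned. It is not the authors' WFST implementation: the function name `estimate_confusions`, the input format, and the toy data are hypothetical, and the estimate is a plain relative frequency with no smoothing.

```python
from collections import Counter, defaultdict

def estimate_confusions(aligned_pairs):
    """Estimate P(recognized phone | reference phone) by relative frequency.

    aligned_pairs: iterable of (reference_phone, recognized_phone) tuples,
    assumed to come from aligning training transcripts with recognizer output.
    """
    counts = defaultdict(Counter)
    for ref, hyp in aligned_pairs:
        counts[ref][hyp] += 1

    probs = {}
    for ref, hyps in counts.items():
        total = sum(hyps.values())
        probs[ref] = {hyp: n / total for hyp, n in hyps.items()}
    return probs

# Toy alignment data (hypothetical, for illustration only).
pairs = [("p", "p"), ("p", "b"), ("p", "p"), ("p", "p"),
         ("t", "t"), ("t", "d")]
confusions = estimate_confusions(pairs)
```

In a WFST setting, probabilities like these would typically become arc weights (for example, negative log probabilities) on a phone confusion transducer; that step is omitted here.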


doi: 10.21437/Interspeech.2009-350

Cite as: Jyothi, P., Fosler-Lussier, E. (2009) A comparison of audio-free speech recognition error prediction methods. Proc. Interspeech 2009, 1211-1214, doi: 10.21437/Interspeech.2009-350

@inproceedings{jyothi09_interspeech,
  author={Preethi Jyothi and Eric Fosler-Lussier},
  title={{A comparison of audio-free speech recognition error prediction methods}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1211--1214},
  doi={10.21437/Interspeech.2009-350}
}