ISCA Archive Interspeech 2009

On the estimation and the use of confusion-matrices for improving ASR accuracy

Omar Caballero Morales, Stephen J. Cox

In previous work, we described how learning the pattern of recognition errors made by an individual using a certain ASR system leads to increased recognition accuracy compared with a standard MLLR adaptation approach. This was the case for low-intelligibility speakers with dysarthric speech, but no improvement was observed for normal speakers. In this paper, we describe an alternative method of obtaining the training data for confusion-matrix estimation for normal speakers that is more effective than our previous technique. We also address the issue of data sparsity in the estimation of confusion-matrices by using non-negative matrix factorization (NMF) to discover structure within them. The confusion-matrix estimates made using these techniques are integrated into the ASR process using a technique termed “metamodels”, and the results presented here show statistically significant gains in word recognition accuracy when applied to normal speech.
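
As a rough illustration of how NMF can address sparsity in a confusion matrix, the sketch below factorizes a small matrix of confusion counts into two low-rank non-negative factors and row-normalizes the reconstruction, so that unseen confusions receive small non-zero probabilities. This is only a minimal sketch under stated assumptions: the abstract does not specify the NMF variant, the rank, or how the estimates feed into the metamodels, so the multiplicative-update rule, the `rank` value, the toy `counts` data, and the function name `nmf_smooth` are all hypothetical choices made for this example.

```python
import numpy as np

def nmf_smooth(counts, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Low-rank NMF reconstruction of a count matrix, row-normalized.

    Hypothetical helper: uses the standard multiplicative updates for the
    squared-error (Frobenius) objective, which may differ from the paper.
    """
    rng = np.random.default_rng(seed)
    V = np.asarray(counts, dtype=float)
    n_rows, n_cols = V.shape
    # Random non-negative initialization of the two factors.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    approx = W @ H + eps  # strictly positive reconstruction
    # Normalize each row so it sums to 1 (one distribution per row).
    return approx / approx.sum(axis=1, keepdims=True)

# Toy data (made up): rows = spoken phoneme, columns = recognized phoneme.
counts = np.array([
    [12, 3, 0, 0],
    [ 2, 9, 0, 1],
    [ 0, 0, 8, 4],
    [ 0, 1, 3, 7],
])
P = nmf_smooth(counts, rank=2)
print(np.round(P, 3))
```

Each row of `P` can be read as an estimate of the probability of each recognized phoneme given the spoken phoneme. Because the reconstruction is low-rank, rows with similar confusion patterns share statistical strength, which is the intuition behind using NMF when per-speaker counts are sparse.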


doi: 10.21437/Interspeech.2009-207

Cite as: Morales, O.C., Cox, S.J. (2009) On the estimation and the use of confusion-matrices for improving ASR accuracy. Proc. Interspeech 2009, 1599-1602, doi: 10.21437/Interspeech.2009-207

@inproceedings{morales09_interspeech,
  author={Omar Caballero Morales and Stephen J. Cox},
  title={{On the estimation and the use of confusion-matrices for improving ASR accuracy}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1599--1602},
  doi={10.21437/Interspeech.2009-207}
}