10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy

Omar Caballero Morales, Stephen J. Cox

University of East Anglia, UK

In previous work, we described how learning the pattern of recognition errors made by an individual using a certain ASR system leads to increased recognition accuracy compared with a standard MLLR adaptation approach. This was the case for low-intelligibility speakers with dysarthric speech, but no improvement was observed for normal speakers. In this paper, we describe an alternative method of obtaining training data for confusion-matrix estimation for normal speakers, which is more effective than our previous technique. We also address the issue of data sparsity in the estimation of confusion-matrices by using non-negative matrix factorization (NMF) to discover structure within them. The confusion-matrix estimates made using these techniques are integrated into the ASR process using a technique termed “metamodels”, and the results presented here show statistically significant gains in word recognition accuracy when applied to normal speech.
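As a rough illustration of how NMF can address sparsity in a confusion matrix, the sketch below factorizes a small matrix of confusion counts into two low-rank non-negative factors and renormalizes the reconstruction into row-wise confusion probabilities. The abstract does not state which NMF algorithm, rank, or normalization the authors used, so the Lee-Seung multiplicative updates, the rank of 2, the toy counts, and the function names `nmf` and `smooth_confusion_matrix` are all assumptions made for illustration only.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (m x n) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm
    objective (an assumed choice; the abstract names no algorithm).
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def smooth_confusion_matrix(counts, rank=2):
    """Return row-normalized confusion probabilities from a low-rank fit."""
    W, H = nmf(np.asarray(counts, dtype=float), rank)
    approx = W @ H + 1e-12  # small floor avoids division by zero
    return approx / approx.sum(axis=1, keepdims=True)

# Toy counts (hypothetical data): rows are spoken phonemes, columns are
# recognized phonemes. The last row is sparsely observed.
counts = np.array([
    [8, 2, 0, 0],
    [3, 6, 0, 0],
    [0, 0, 5, 1],
    [0, 0, 1, 0],
], dtype=float)

P = smooth_confusion_matrix(counts, rank=2)
print(np.round(P, 3))
```

Each row of `P` is a probability distribution over recognized phonemes. Because the reconstruction shares structure across rows through the low-rank factors, a sparsely observed row can borrow probability mass from rows with similar confusion patterns, which is the intuition behind using NMF to discover structure in the matrix.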


Bibliographic reference.  Morales, Omar Caballero / Cox, Stephen J. (2009): "On the estimation and the use of confusion-matrices for improving ASR accuracy", In INTERSPEECH-2009, 1599-1602.