We address the problem of word confusability in speech recognition by measuring the similarity between Hidden Markov Models (HMMs) using a number of recently developed techniques. The focus is on defining a word confusability that is accurate, in the sense of predicting artificial speech recognition errors, and computationally efficient when applied to speech recognition applications. It is shown by using the edit distance framework for HMMs that we can use statistical information measures of distances between probability distribution functions to define similarity or distance measures between HMMs. We use correlation between errors in a real speech recognizer and the HMM similarities to measure how well each technique works. We demonstrate significant improvements relative to traditional phone confusion weighted edit distance measures by use of a Bhattacharyya divergence-based edit distance.
Bibliographic reference. Chen, Jia-Yu / Olsen, Peder A. / Hershey, John R. (2007): "Word confusability - measuring hidden Markov model similarity", In INTERSPEECH-2007, 2089-2092.