INTERSPEECH 2004 - ICSLP
8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Word Confusability Prediction in Automatic Speech Recognition

Jan Anguita (1), Stephane Peillon (2), Javier Hernando (1), Alexandre Bramoulle (2)

(1) Universitat Politecnica de Catalunya, Spain
(2) Telisma, France

This paper presents a new method for predicting whether two words are likely to be confused by an Automatic Speech Recognition (ASR) system. A new inter-word dissimilarity measure based on Dynamic Time Warping (DTW) is used to classify word pairs as confusable or not confusable. First, the phonetic transcriptions of the two words being compared are aligned using only phonetic information. After the alignment, the accumulated distance is computed with a new inter-phone acoustic distance calculated between the Hidden Markov Models (HMMs) of the phones. Two kinds of alignment were used: with and without insertions and omissions. In a classical false acceptance/false rejection framework, the prediction Equal Error Rate (EER) was measured to be 1.6%, a 50% reduction with respect to the conventional DTW distance.
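
The two-stage idea in the abstract (align on phonetic information, then accumulate an acoustic distance along that alignment) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the unit gap penalties, the 0/1 phonetic cost, and the toy acoustic table standing in for a real HMM-based inter-phone distance. Only the variant that allows insertions and omissions is shown.

```python
from typing import Callable, Sequence

PhoneCost = Callable[[str, str], float]


def word_dissimilarity(
    a: Sequence[str],
    b: Sequence[str],
    align_cost: PhoneCost,       # phonetic cost used only to choose the alignment
    acoustic_dist: PhoneCost,    # distance accumulated along the chosen alignment
    gap_align: float = 1.0,      # assumed penalty for an insertion/omission (alignment)
    gap_acoustic: float = 1.0,   # assumed acoustic penalty for an insertion/omission
) -> float:
    """Align two phone sequences and return the accumulated acoustic distance."""
    n, m = len(a), len(b)
    # Each cell holds (alignment cost, accumulated acoustic distance).
    D = [[(0.0, 0.0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = (i * gap_align, i * gap_acoustic)
    for j in range(1, m + 1):
        D[0][j] = (j * gap_align, j * gap_acoustic)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            p, q = a[i - 1], b[j - 1]
            sub, om, ins = D[i - 1][j - 1], D[i - 1][j], D[i][j - 1]
            # Tuples compare by alignment cost first; ties fall back to
            # the smaller accumulated acoustic distance.
            D[i][j] = min(
                (sub[0] + align_cost(p, q), sub[1] + acoustic_dist(p, q)),
                (om[0] + gap_align, om[1] + gap_acoustic),
                (ins[0] + gap_align, ins[1] + gap_acoustic),
            )
    return D[n][m][1]


def phonetic_cost(p: str, q: str) -> float:
    """Hypothetical phonetic cost: 0 for identical phones, 1 otherwise."""
    return 0.0 if p == q else 1.0


def toy_acoustic(p: str, q: str) -> float:
    """Hypothetical stand-in for an HMM-based inter-phone distance."""
    if p == q:
        return 0.0
    table = {frozenset(("t", "p")): 0.4}  # made-up value, for illustration only
    return table.get(frozenset((p, q)), 1.0)


def is_confusable(dissimilarity: float, threshold: float) -> bool:
    """Classify a word pair as confusable when its dissimilarity is below a threshold."""
    return dissimilarity < threshold
```

In such a setup the threshold would be tuned on labeled word pairs, for example at the operating point where the false acceptance and false rejection rates are equal.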

Bibliographic reference. Anguita, Jan / Peillon, Stephane / Hernando, Javier / Bramoulle, Alexandre (2004): "Word confusability prediction in automatic speech recognition", in INTERSPEECH-2004, 1489-1492.