8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


New Model-Based HMM Distances with Applications to Run-Time ASR Error Estimation and Model Tuning

Chao-Shih Huang (1), Chin-Hui Lee (2), Hsiao-Chuan Wang (3)

(1) Acer Inc., Taiwan
(2) Georgia Institute of Technology, USA
(3) National Tsing Hua University, Taiwan

We propose a novel model-based HMM distance computation framework that estimates run-time recognition errors and adapts recognition parameters without requiring any testing or adaptation data. The key idea is to use HMM distances between competing models to measure the confusability between phones in speech recognition. Starting with a set of simulated models for a given noise condition, the corresponding error rate can be estimated with a smooth approximation of the error count computed from the set of phone distances, without using any testing data. By minimizing the estimated error between the desired and simulated models, the target model parameters can also be adjusted without using any adaptation data. Experimental results show that the word errors estimated with the proposed framework closely resemble the errors obtained by running actual recognition experiments on a large testing set in a number of adverse conditions. The adapted models also gave better recognition performance than environment-matched models, especially in low signal-to-noise conditions.
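
The idea of turning phone-model distances into a smooth error estimate can be illustrated with a small sketch. The abstract does not specify the distance measure or the smoothing function, so everything here is an assumption made for illustration: each phone is reduced to a single diagonal-covariance Gaussian, the distance is a symmetric Kullback-Leibler divergence, and the error count is smoothed with a sigmoid of the distance to the nearest competing phone. The function names and the parameters `alpha` and `beta` are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians (lists of floats)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_distance(model_a, model_b):
    """Symmetric KL divergence; an assumed stand-in for an HMM distance."""
    (mean_a, var_a), (mean_b, var_b) = model_a, model_b
    return (gaussian_kl(mean_a, var_a, mean_b, var_b)
            + gaussian_kl(mean_b, var_b, mean_a, var_a))


def smooth_error_estimate(models, alpha=1.0, beta=0.0):
    """Average a sigmoid of the distance to each phone's nearest competitor.

    A small distance (confusable phones) contributes close to 0.5 when
    beta is 0; a large distance contributes close to 0. alpha and beta
    are hypothetical slope and offset parameters.
    """
    names = list(models)
    total = 0.0
    for name in names:
        nearest = min(symmetric_distance(models[name], models[other])
                      for other in names if other != name)
        # 1 / (1 + exp(x)) written with tanh so large x cannot overflow.
        total += 0.5 * (1.0 - math.tanh(0.5 * alpha * (nearest - beta)))
    return total / len(names)


# Toy one-dimensional "phone models": (means, variances).
clean = {"a": ([0.0], [1.0]), "b": ([4.0], [1.0])}
# Simulated noisy condition: same means, inflated variances.
noisy = {"a": ([0.0], [4.0]), "b": ([4.0], [4.0])}

print(smooth_error_estimate(clean), smooth_error_estimate(noisy))
```

In this toy setting the noisy models are closer together under the assumed distance, so the estimate for the noisy condition is higher than for the clean one, and no test utterances are involved in computing either value.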

Bibliographic reference.  Huang, Chao-Shih / Lee, Chin-Hui / Wang, Hsiao-Chuan (2003): "New model-based HMM distances with applications to run-time ASR error estimation and model tuning", In EUROSPEECH-2003, 457-460.