7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

HMM Composition-Based Rapid Model Adaptation Using a Priori Noise GMM Adaptation Evaluation on Aurora2 Corpus

Masaki Ida, Satoshi Nakamura

ATR Spoken Language Translation Research Laboratories, Japan

When a speech recognition system is used in a real environment, its recognition performance is affected by the surrounding noise. Most types of additional noise, as well as their SNRs, are difficult to predict, so there is a mismatch between the training and test data, and a method is needed to deal with it. In this paper, we propose an HMM composition-based model adaptation method that uses a priori noise GMM adaptation to counter the mismatch between different types of noise in noisy data. We also prepare multiple HMMs for several SNRs and, to deal with unknown SNRs, select the one that is most effective according to the acoustic likelihood. We carried out speech recognition experiments in noisy environments using test set B of the AURORA2 task. The results show a 53% improvement in word accuracy over the baseline system when one second of real noise data is used for adaptation. This performance is equivalent to that of conventional HMM composition methods using ten seconds of real data.
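The likelihood-based selection among SNR-specific models mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: diagonal-covariance GMMs stand in for the SNR-specific HMMs as an assumption, and the function names, the SNR labels (5, 10, 20) and the one-dimensional toy features are all hypothetical.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def select_snr_model(frames, models_by_snr):
    """Return the SNR label whose model gives the highest acoustic likelihood."""
    return max(models_by_snr, key=lambda snr: gmm_loglik(frames, models_by_snr[snr]))

# Toy example: one single-component model per hypothetical SNR condition.
models_by_snr = {
    5: [(1.0, [0.0], [1.0])],
    10: [(1.0, [3.0], [1.0])],
    20: [(1.0, [6.0], [1.0])],
}
frames = [[2.8], [3.1], [3.4]]
best = select_snr_model(frames, models_by_snr)
```

In a real recognizer the score would come from decoding with each SNR-specific HMM set rather than from a frame-level GMM, but the selection step stays the same: keep the model with the highest likelihood on the observed data.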


Bibliographic reference. Ida, Masaki / Nakamura, Satoshi (2002): "HMM composition-based rapid model adaptation using a priori noise GMM adaptation evaluation on Aurora2 corpus", In ICSLP-2002, 437-440.