ASR2000 - Automatic Speech Recognition: Challenges for the new Millenium
September 18-20, 2000
This paper proposes several new speaker adaptation techniques to improve the accuracy of large vocabulary continuous speech recognition: discriminative adaptation, state quality measure based adaptation, and N-best hypothesis based adaptation. We propose incorporating the MMIE criterion into the computation of the posterior counts from the adaptation data. We present a new measure, the state quality measure, to evaluate the quality of an HMM state, and use it both for selecting good segments of speech during unsupervised adaptation and as a confidence measure during decoding/rescoring. The state quality measure is the confidence associated with the acoustic model’s ability to predict the HMM state correctly. It is estimated from the correct and decoded sets of transcriptions and is used in conjunction with N-best hypotheses to weight the state occupancy counts during adaptation. Alongside the adaptation schemes, we also use the Viterbi algorithm instead of the Forward-Backward algorithm to estimate the HMM state occupancy counts, which yields speed-ups without degradation in accuracy. Our results on an in-house spontaneous speech task show relative improvements in the range of 4% to 14% for each of the presented techniques.
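The contrast between Forward-Backward and Viterbi state occupancy counts can be illustrated with a minimal sketch on a toy discrete HMM. This is not the authors' system: the function names, the two-state model, and the observation sequence are all hypothetical, and a real recognizer would use continuous output densities and log-domain arithmetic. Forward-Backward accumulates a posterior probability for every state at every frame, while Viterbi assigns each frame wholly to the state on the single best path, which is cheaper to compute.

```python
def forward_backward_counts(pi, A, B, obs):
    """Soft occupancy counts: sum over frames of P(state at t = s | obs).

    pi: initial state probabilities, A: transition matrix,
    B: discrete emission matrix B[state][symbol], obs: symbol indices.
    No scaling is applied, so this is only suitable for short sequences.
    """
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[0.0] * n for _ in range(T)]
    for s in range(n):
        alpha[0][s] = pi[s] * B[s][obs[0]]
        beta[T - 1][s] = 1.0
    for t in range(1, T):
        for s in range(n):
            alpha[t][s] = B[s][obs[t]] * sum(
                alpha[t - 1][r] * A[r][s] for r in range(n))
    for t in range(T - 2, -1, -1):
        for s in range(n):
            beta[t][s] = sum(
                A[s][r] * B[r][obs[t + 1]] * beta[t + 1][r] for r in range(n))
    likelihood = sum(alpha[T - 1])
    return [sum(alpha[t][s] * beta[t][s] for t in range(T)) / likelihood
            for s in range(n)]


def viterbi_counts(pi, A, B, obs):
    """Hard occupancy counts: number of frames each state occupies
    on the single most likely state path."""
    n, T = len(pi), len(obs)
    delta = [pi[s] * B[s][obs[0]] for s in range(n)]
    backptrs = []
    for t in range(1, T):
        prev = delta
        # Best predecessor for each state at frame t.
        ptr = [max(range(n), key=lambda r: prev[r] * A[r][s])
               for s in range(n)]
        delta = [prev[ptr[s]] * A[ptr[s]][s] * B[s][obs[t]]
                 for s in range(n)]
        backptrs.append(ptr)
    # Trace back from the best final state, counting one frame per step.
    state = max(range(n), key=lambda s: delta[s])
    counts = [0.0] * n
    counts[state] += 1.0
    for ptr in reversed(backptrs):
        state = ptr[state]
        counts[state] += 1.0
    return counts


# Hypothetical two-state model and observation sequence.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1, 0]

soft = forward_backward_counts(pi, A, B, obs)
hard = viterbi_counts(pi, A, B, obs)
print("Forward-Backward counts:", soft)
print("Viterbi counts:", hard)
```

Both sets of counts sum to the number of frames; the Viterbi counts are whole numbers because each frame contributes to exactly one state. In an adaptation scheme of the kind the abstract describes, such counts could then be scaled by a per-state or per-hypothesis weight before the model statistics are updated, though the exact weighting is not specified here.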
Bibliographic reference. Gao, Yuqing / Ramabhadran, Bhuvana / Picheny, Michael (2000): "New adaptation techniques for large vocabulary continuous speech recognition", In ASR-2000, 107-111.