Sixth European Conference on Speech Communication and Technology
This paper reports the results of an experiment that uses corrective training techniques for rapid acoustic speaker adaptation in a semi-continuous speech recognition system. Decoder output is used to adjust the HMM acoustic models so that they discriminate better between correct words and near misses. Twenty sentences serve as the adaptation set. A speech recognizer is run on each utterance to generate a word lattice, and the lattice is pruned relative to the correct path. The forward-backward algorithm is then used to align each path in the lattice against the speech input and to compute observation counts. For each input frame, counts in correct models are adjusted upward and counts in incorrect models are adjusted downward. The adjusted counts are normalized to produce new observation probabilities for the models. The parameters being adjusted are the mixture weights of the semi-continuous HMMs. The technique reduced the word error rate for a test subject by 37% relative.
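The count adjustment and normalization step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `corrective_update`, the parameters `prior_mass`, `step`, and `floor`, and the dictionary layout are all assumptions made for illustration. The sketch also assumes that the forward-backward alignment has already produced per-state codeword counts for the correct path and for the competing (incorrect) lattice paths.

```python
def corrective_update(weights, correct_counts, incorrect_counts,
                      prior_mass=10.0, step=1.0, floor=1e-4):
    """Adjust semi-continuous HMM mixture weights from corrective counts.

    weights:          dict mapping state -> list of mixture weights (sums to 1)
    correct_counts:   dict mapping state -> per-codeword counts from the correct path
    incorrect_counts: dict mapping state -> per-codeword counts from incorrect paths
    prior_mass, step, floor: hypothetical tuning knobs, not taken from the paper
    """
    new_weights = {}
    for state, w in weights.items():
        zeros = [0.0] * len(w)
        up = correct_counts.get(state, zeros)
        down = incorrect_counts.get(state, zeros)
        # Treat the current weights as pseudo-counts, raise them by the
        # correct-path counts, lower them by the incorrect-path counts,
        # and floor the result so no weight becomes zero or negative.
        adjusted = [
            max(prior_mass * wk + step * (uk - dk), floor)
            for wk, uk, dk in zip(w, up, down)
        ]
        # Normalize the adjusted counts back into a probability distribution.
        total = sum(adjusted)
        new_weights[state] = [a / total for a in adjusted]
    return new_weights


# Toy example: one state with three codewords.
weights = {"s1": [0.5, 0.3, 0.2]}
correct = {"s1": [2.0, 0.0, 0.0]}    # codeword 0 was observed on the correct path
incorrect = {"s1": [0.0, 1.0, 0.0]}  # codeword 1 was observed on a near-miss path
updated = corrective_update(weights, correct, incorrect)
print(updated["s1"])
```

In this toy run the weight of the codeword seen on the correct path rises and the weight of the codeword seen on the near-miss path falls. The floor is one simple way to keep the distribution valid when the downward adjustment exceeds the available count; the abstract does not say how the original system handles that case.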
Bibliographic reference. Yu, Xiuyang / Ward, Wayne (1999): "Corrective training for speaker adaptation", In EUROSPEECH'99, 2535-2538.