INTERSPEECH 2004 - ICSLP
Generalized word posterior probability (GWPP), a confidence measure for verifying recognized words, requires properly optimized acoustic and language model weights. In this study, we investigate the word verification error surface and use it to optimize these weights and the corresponding verification threshold on a development set. We test three search algorithms for finding the optimal parameters: a full grid search, a gradient-based steepest descent search, and a downhill simplex search. The three search methods yield very similar solutions. The proper acoustic and language model weights, and especially the ratio between them, change with the relative importance (reliability) of the two knowledge sources. For a narrow beam width, the acoustic model plays a less critical role than the language model in GWPP-based word verification, because the acoustic information maintained in a narrow beam is noisier. On a Japanese large-vocabulary continuous speech database (Basic Travel Expression Corpus), the largest relative improvements obtained are 33.2% in confidence error rate and 38.7% in a modified word accuracy.
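The full grid search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the posterior below is a simplified stand-in for GWPP that sums weighted scores over competing string hypotheses and ignores time registration, and the names `word_posterior`, `confidence_error_rate`, `grid_search`, the toy data, and the grid values are all assumptions made for illustration.

```python
import itertools
import math


def word_posterior(hyps, alpha, beta):
    """Simplified word posterior (assumption, not the full GWPP definition).

    hyps: list of (acoustic_loglik, lm_logprob, contains_word) tuples, one per
    competing string hypothesis. alpha and beta are the acoustic and language
    model weights applied in the log domain.
    """
    scores = [alpha * ac + beta * lm for ac, lm, _ in hyps]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    numerator = sum(e for e, (_, _, has_word) in zip(exps, hyps) if has_word)
    return numerator / sum(exps)


def confidence_error_rate(words, alpha, beta, threshold):
    """Fraction of words that are falsely accepted or falsely rejected.

    words: list of (hyps, is_correct) pairs from a development set.
    """
    errors = 0
    for hyps, is_correct in words:
        accepted = word_posterior(hyps, alpha, beta) >= threshold
        errors += accepted != is_correct
    return errors / len(words)


def grid_search(words, alphas, betas, thresholds):
    """Exhaustively evaluate every (alpha, beta, threshold) combination."""
    best = min(
        itertools.product(alphas, betas, thresholds),
        key=lambda p: confidence_error_rate(words, *p),
    )
    return best, confidence_error_rate(words, *best)


# Toy development set (hypothetical numbers): one correct and one incorrect word.
dev_words = [
    ([(-10.0, -2.0, True), (-12.0, -1.0, False)], True),
    ([(-11.0, -3.0, True), (-10.5, -1.0, False)], False),
]
alphas = [0.05, 0.1, 0.5, 1.0]
betas = [0.5, 1.0]
thresholds = [0.2, 0.5, 0.8]

best_params, best_cer = grid_search(dev_words, alphas, betas, thresholds)
print("best (alpha, beta, threshold):", best_params)
print("confidence error rate:", best_cer)
```

The steepest descent and downhill simplex searches mentioned in the abstract would replace `grid_search` with an iterative optimizer over the same `confidence_error_rate` objective; the grid version is shown because it needs no assumptions about the smoothness of the error surface.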
Bibliographic reference. Soong, Frank K. / Lo, Wai Kit / Nakamura, Satoshi (2004): "Optimal acoustic and language model weights for minimizing word verification errors", In INTERSPEECH-2004, 441-444.