Log-linear model combination is the standard approach in large vocabulary continuous speech recognition (LVCSR) for combining several knowledge sources, usually an acoustic model and a language model. Instead of using a single scaling factor per knowledge source, we make the scaling factors word- and pronunciation-dependent. In this work, we combine three acoustic models, a pronunciation model, and a language model for a Mandarin broadcast news/broadcast conversation (BN/BC) task. The achieved error rate reduction of 2% relative is small but consistent across two test sets. An analysis of the results shows that the major contribution comes from the improved interdependency of the language and acoustic models.
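To make the idea concrete, the following is a minimal sketch of a log-linear combination in which a scaling factor can be overridden per word (or per word/pronunciation pair). It is not the authors' implementation: the function name `loglinear_score`, the model keys `"am"` and `"lm"`, and all numeric values are hypothetical, and the abstract does not specify how the word-dependent factors are parameterized or trained.

```python
def loglinear_score(unit, log_probs, global_scales, unit_scales=None):
    """Combine per-model log-probabilities for one hypothesis unit.

    unit          -- a word, or a (word, pronunciation) tuple (assumed keying)
    log_probs     -- dict: model name -> log-probability assigned to the unit
    global_scales -- dict: model name -> single scaling factor (baseline)
    unit_scales   -- optional dict: (model name, unit) -> scaling factor that
                     overrides the global factor for that unit
    """
    unit_scales = unit_scales or {}
    score = 0.0
    for model, log_p in log_probs.items():
        # Fall back to the global factor when no unit-specific one exists.
        scale = unit_scales.get((model, unit), global_scales[model])
        score += scale * log_p
    return score


# Illustrative values only (not taken from the paper).
log_probs = {"am": -10.0, "lm": -2.0}
global_scales = {"am": 1.0, "lm": 12.0}

# Baseline: one scaling factor per knowledge source.
baseline = loglinear_score("the", log_probs, global_scales)

# Word-dependent: the language model factor is overridden for "the" only.
word_dependent = loglinear_score(
    "the", log_probs, global_scales, unit_scales={("lm", "the"): 10.0}
)
```

With an empty `unit_scales`, the function reduces to the usual combination with a single factor per knowledge source, so the word-dependent variant can be read as a strict generalization of the baseline.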
Bibliographic reference. Hoffmeister, Björn / Liang, Ruoying / Schlüter, Ralf / Ney, Hermann (2009): "Log-linear model combination with word-dependent scaling factors", In INTERSPEECH-2009, 248-251.