In language processing applications such as speech recognition, printed/handwritten character recognition, or statistical machine translation, the language model usually has a major influence on performance by introducing context. Increasing the context length typically improves perplexity and increases the accuracy of a classifier using such a language model. In this work, the effect of context reduction, i.e. the accuracy difference between a context-sensitive and a context-insensitive classifier, is considered. Context reduction is shown to be related to feature omission in the case of single-symbol classification. Therefore, the simplest non-trivial case of feature omission is analyzed by comparing a feature-aware classifier that uses an emission model to a prior-only classifier that statically predicts the prior-maximizing class for all observations. Tight upper and lower bounds are presented for the accuracy difference between these model classifiers. The corresponding analytic proofs, though not presented here, were supported by an extensive simulation analysis of the problem, which yielded empirical estimates of the bounds on the accuracy difference. Further, it is shown that the same bounds, though no longer tight, also apply to the original case of context reduction. This result is supported by further simulation experiments for symbol-string classification.
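The comparison at the heart of the abstract can be illustrated with a minimal simulation sketch. This is not the paper's experimental setup; it assumes an arbitrary random joint distribution over a small number of classes and discrete observations, and measures the accuracy gap between the Bayes-optimal feature-aware classifier and the prior-only classifier:

```python
import numpy as np

# Hypothetical setup: 4 classes, 10 discrete observation symbols.
rng = np.random.default_rng(0)
n_classes, n_obs = 4, 10

# Random joint distribution p(c, x) over classes c and observations x.
joint = rng.random((n_classes, n_obs))
joint /= joint.sum()

prior = joint.sum(axis=1)  # class prior p(c)

# Feature-aware (Bayes) classifier: predict argmax_c p(c, x) per observation;
# its accuracy is the total mass of the per-observation maxima.
bayes_acc = joint.max(axis=0).sum()

# Prior-only classifier: always predict argmax_c p(c);
# its accuracy is simply the largest prior.
prior_acc = prior.max()

# Accuracy difference caused by omitting the observation feature.
diff = bayes_acc - prior_acc
print(f"feature-aware: {bayes_acc:.4f}, prior-only: {prior_acc:.4f}, diff: {diff:.4f}")
```

Since the feature-aware classifier is Bayes-optimal for this joint distribution, the difference is always non-negative; repeating the draw many times gives an empirical picture of how large the gap can get, in the spirit of the simulation analysis mentioned above.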
Bibliographic reference. Beck, Eugen / Schlüter, Ralf / Ney, Hermann (2015): "Error bounds for context reduction and feature omission", In INTERSPEECH-2015, 1280-1284.