Modern speech recognition systems typically cluster triphone phonetic contexts using decision trees. In this paper, we describe a way to build multiple complementary decision trees from the same data, for the purpose of system combination. We do this by jointly building the decision trees using an objective function that has an added entropy term to encourage diversity among the trees. After the trees are built, the systems are trained in the standard way and their emission probabilities are combined during decoding. Experiments on multiple datasets show gains from the use of multiple trees, at the cost of evaluating multiple models at test time.
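As an illustration of the decoding-time combination described above, the following is a minimal sketch of two common ways to merge per-frame emission scores from several models. The abstract does not state which combination rule is used, so both rules are assumptions, and the function names `combine_log_linear` and `combine_linear` are hypothetical. Each input list holds the log-likelihood that each tree's model assigns to the same frame and the same phonetic context.

```python
import math


def _uniform_or_checked(weights, n):
    # Assumption: uniform weights when none are given; weights must sum to 1.
    if n == 0:
        raise ValueError("need at least one model score")
    if weights is None:
        return [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("one weight per model is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return list(weights)


def combine_log_linear(loglikes, weights=None):
    """Weighted average of log-likelihoods (a weighted geometric mean
    of the underlying probabilities)."""
    w = _uniform_or_checked(weights, len(loglikes))
    return sum(wi * ll for wi, ll in zip(w, loglikes))


def combine_linear(loglikes, weights=None):
    """Weighted average of probabilities, computed in the log domain
    with the log-sum-exp trick for numerical stability."""
    w = _uniform_or_checked(weights, len(loglikes))
    m = max(loglikes)
    return m + math.log(sum(wi * math.exp(ll - m) for wi, ll in zip(w, loglikes)))


if __name__ == "__main__":
    # Hypothetical scores from two models for one frame.
    scores = [math.log(0.2), math.log(0.4)]
    print(combine_log_linear(scores))  # log of the geometric mean
    print(combine_linear(scores))      # log of the arithmetic mean
```

Either rule requires scoring every frame with every model, which is the extra test-time cost the abstract mentions.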
Bibliographic reference. Xu, Hainan / Chen, Guoguo / Povey, Daniel / Khudanpur, Sanjeev (2015): "Modeling phonetic context with non-random forests for speech recognition", In INTERSPEECH-2015, 2117-2121.