Decision tree-based context clustering is an essential but time-consuming step in building HMM-based speech synthesis systems. It clusters HMM states (or streams) according to their contexts so as to maximize the log likelihood of the training data given the model. The most widely used implementation is not designed to exploit highly parallel architectures such as GPUs. This paper presents an implementation of tree-based clustering for such architectures. Experimental results show that the new implementation running on GPUs is an order of magnitude faster than the conventional one running on CPUs.
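To make the clustering criterion concrete, the following is a minimal sketch of one split decision, written in plain Python rather than GPU code. It is not the paper's implementation. It assumes each state is summarized by sufficient statistics (occupancy, per-dimension sum and sum of squares) and modeled by a single diagonal-covariance Gaussian, which is a common setup but is not stated in the abstract. The names `cluster_log_likelihood`, `best_question`, `make_state`, the variance floor, and the toy contexts are all hypothetical. Each candidate question is scored independently of the others, which is the kind of workload that could be spread across many parallel threads.

```python
import math

VAR_FLOOR = 1e-6  # assumed floor to keep log() finite; not from the paper


def cluster_log_likelihood(stats):
    """Log likelihood of the data when all given states share one Gaussian.

    stats: list of (occupancy, sum_x, sum_x2), with sum_x and sum_x2
    given as per-dimension lists.
    """
    occ = sum(s[0] for s in stats)
    if occ <= 0:
        return 0.0
    dim = len(stats[0][1])
    log_det = 0.0
    for d in range(dim):
        mean = sum(s[1][d] for s in stats) / occ
        var = sum(s[2][d] for s in stats) / occ - mean * mean
        log_det += math.log(max(var, VAR_FLOOR))
    return -0.5 * occ * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


def best_question(states, questions):
    """Return (question name, log-likelihood gain) of the best binary split."""
    parent = cluster_log_likelihood([s["stats"] for s in states])
    best_name, best_gain = None, 0.0
    # Each iteration is independent, so the loop could run in parallel.
    for name, predicate in questions.items():
        yes = [s["stats"] for s in states if predicate(s["context"])]
        no = [s["stats"] for s in states if not predicate(s["context"])]
        if not yes or not no:
            continue  # the question does not split this node
        gain = cluster_log_likelihood(yes) + cluster_log_likelihood(no) - parent
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


def make_state(phone, stressed, mean, occ=10.0, var=1.0):
    """Build a toy 1-D state from its occupancy, mean, and variance."""
    return {
        "context": {"phone": phone, "stressed": stressed},
        "stats": (occ, [occ * mean], [occ * (var + mean * mean)]),
    }


# Toy data: vowels sit near 0, consonants near 10.
states = [
    make_state("a", True, 0.0),
    make_state("i", False, 0.2),
    make_state("k", True, 10.0),
    make_state("t", False, 10.2),
]
questions = {
    "is_vowel": lambda c: c["phone"] in {"a", "i"},
    "is_stressed": lambda c: c["stressed"],
}
best_name, best_gain = best_question(states, questions)
print(best_name, best_gain)
```

On this toy data the `is_vowel` question is selected, because it separates the two groups of means and so reduces the pooled variance far more than `is_stressed` does. A full clustering run would repeat this selection at every leaf until the best gain falls below a stopping threshold.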
Bibliographic reference. Pilkington, Nicholas / Zen, Heiga (2010): "An implementation of decision tree-based context clustering on graphics processing units", In INTERSPEECH-2010, 833-836.