A Comparative Evaluation of GMM-Free State Tying Methods for ASR

Tamás Grósz, Gábor Gosztolya, László Tóth


Deep neural network (DNN) based speech recognizers have recently replaced Gaussian mixture model (GMM) based systems as the state of the art. While some of the modeling techniques developed for the GMM-based framework can be applied directly to HMM/DNN systems, others may be inappropriate. One such example is the creation of context-dependent tied states, for which an efficient decision tree state tying method exists. The tied states used to train DNNs are usually obtained with this same tying algorithm, even though it is based on the likelihoods of Gaussians and is therefore better suited to HMM/GMM systems. Recently, however, several refinements have been published that seek to adapt the state tying algorithm to the HMM/DNN hybrid architecture. Unfortunately, these studies reported results on different (and sometimes very small) datasets, which prevents a direct comparison. Here, we tested four of these methods on the same LVCSR task and compared their performance under identical conditions. We found that, besides changing the input of the context-dependent state tying algorithm, it is worth adjusting the tying criterion as well. The methods that used a decision criterion designed directly for neural networks consistently and significantly outperformed those that employed the standard Gaussian-based algorithm.
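
To illustrate the distinction between the two kinds of tying criteria mentioned above, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a single diagonal Gaussian per cluster for the likelihood-based criterion and, as one possible neural-network-oriented alternative, a criterion that measures the reduction in total KL divergence between frame-level posterior vectors and their cluster mean. The abstract does not specify the exact criteria compared, so both formulas and all function names here are illustrative assumptions.

```python
import math

def gaussian_loglik(frames):
    """Log-likelihood of frames under their own ML diagonal Gaussian."""
    n, dim = len(frames), len(frames[0])
    log_var_sum = 0.0
    for d in range(dim):
        col = [f[d] for f in frames]
        mean = sum(col) / n
        # Variance floor (an assumption) avoids log(0) for tiny clusters.
        var = max(sum((x - mean) ** 2 for x in col) / n, 1e-6)
        log_var_sum += math.log(var)
    return -0.5 * n * (dim * (1.0 + math.log(2.0 * math.pi)) + log_var_sum)

def gaussian_gain(yes, no):
    """Likelihood gain of splitting a node into 'yes' and 'no' children."""
    return gaussian_loglik(yes) + gaussian_loglik(no) - gaussian_loglik(yes + no)

def weighted_entropy(posteriors):
    """N times the entropy of the mean posterior vector of a cluster."""
    n, k = len(posteriors), len(posteriors[0])
    mean = [sum(p[j] for p in posteriors) / n for j in range(k)]
    return -n * sum(m * math.log(m) for m in mean if m > 0.0)

def kl_gain(yes, no):
    """Reduction in total KL divergence to the cluster mean after a split."""
    return weighted_entropy(yes + no) - weighted_entropy(yes) - weighted_entropy(no)

if __name__ == "__main__":
    # Toy acoustic features (1-D) and toy posterior vectors (2 classes).
    print(gaussian_gain([[0.0], [0.1]], [[5.0], [5.1]]))
    print(kl_gain([[0.9, 0.1], [0.8, 0.2]], [[0.1, 0.9], [0.2, 0.8]]))
```

In both cases a decision tree would pick, at each node, the phonetic question whose split yields the largest gain. The difference lies in what is clustered (acoustic feature vectors versus network output distributions) and in how cluster homogeneity is scored.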


DOI: 10.21437/Interspeech.2017-899

Cite as: Grósz, T., Gosztolya, G., Tóth, L. (2017) A Comparative Evaluation of GMM-Free State Tying Methods for ASR. Proc. Interspeech 2017, 1626-1630, DOI: 10.21437/Interspeech.2017-899.


@inproceedings{Grósz2017,
  author={Tamás Grósz and Gábor Gosztolya and László Tóth},
  title={A Comparative Evaluation of GMM-Free State Tying Methods for ASR},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1626--1630},
  doi={10.21437/Interspeech.2017-899},
  url={http://dx.doi.org/10.21437/Interspeech.2017-899}
}