Recently, a new approach that incorporates deep neural network (DNN) bottleneck features into HMM based acoustic models using generalized variable parameter HMMs (GVP-HMMs) was proposed. As Gaussian component level polynomial interpolation is performed for each high-dimensional DNN bottleneck feature vector at the frame level, conventional GVP-HMMs are computationally expensive at recognition time. To address this problem, several approaches are investigated in this paper to efficiently use DNN bottleneck features in GVP-HMMs: model selection techniques that optimally reduce the polynomial degrees, an efficient GMM based bottleneck feature clustering scheme, and more compact GVP-HMM trajectory modelling of model-space tied linear transformations. These improvements gave a total 16-fold speed-up in decoding time over conventional GVP-HMMs using a uniformly assigned polynomial degree. A significant error rate reduction of 15.6% relative was obtained over the baseline tandem HMM system on the secondary microphone channel condition of the Aurora 4 task. Consistent improvements were also obtained on other subsets.
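The core idea above is that each Gaussian component's parameters are not fixed but follow a polynomial trajectory of a condition variable derived from the bottleneck features. A minimal sketch of that interpolation step, assuming for simplicity a scalar condition variable and a diagonal-covariance Gaussian (the function names `gvp_mean` and `log_gaussian` and the toy coefficients are illustrative, not the paper's implementation):

```python
import numpy as np

def gvp_mean(coeffs, v):
    """Evaluate a polynomial mean trajectory mu(v) = sum_k coeffs[k] * v**k.

    coeffs: (degree+1, dim) array of per-degree coefficient vectors.
    v: scalar condition variable (a stand-in for a bottleneck feature
       component; the paper uses high-dimensional bottleneck vectors).
    """
    powers = v ** np.arange(coeffs.shape[0])  # [1, v, v^2, ...]
    return powers @ coeffs                    # (dim,) interpolated mean

def log_gaussian(x, mean, var):
    """Diagonal-covariance Gaussian log-likelihood of frame x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# Toy example: a degree-2 trajectory for a 3-dimensional mean vector.
coeffs = np.array([[0.0, 1.0, -1.0],   # constant term
                   [0.5, 0.0,  2.0],   # linear term
                   [0.1, 0.1,  0.0]])  # quadratic term
mean = gvp_mean(coeffs, v=2.0)
print(mean)  # [1.4 1.4 3. ]
```

Because this evaluation is repeated per Gaussian component and per frame, the cost grows with the polynomial degree and the bottleneck dimensionality, which is exactly what the degree-reduction and feature-clustering schemes in the paper are designed to cut down.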
Cite as: Su, R., Xie, X., Liu, X., Wang, L. (2015) Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition. Proc. Interspeech 2015, 2474-2478, doi: 10.21437/Interspeech.2015-534
@inproceedings{su15b_interspeech,
  author={Rongfeng Su and Xurong Xie and Xunying Liu and Lan Wang},
  title={{Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2474--2478},
  doi={10.21437/Interspeech.2015-534}
}