16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Efficient Use of DNN Bottleneck Features in Generalized Variable Parameter HMMs for Noise Robust Speech Recognition

Rongfeng Su, Xurong Xie, Xunying Liu, Lan Wang

Chinese Academy of Sciences, China

Recently, a new approach was proposed that incorporates deep neural network (DNN) bottleneck features into HMM based acoustic models using generalized variable parameter HMMs (GVP-HMMs). As Gaussian component level polynomial interpolation is performed for each high dimensional DNN bottleneck feature vector at the frame level, conventional GVP-HMMs are computationally expensive to use at recognition time. To address this problem, several approaches are exploited in this paper to use DNN bottleneck features efficiently in GVP-HMMs: model selection techniques to optimally reduce the polynomial degrees; an efficient GMM based bottleneck feature clustering scheme; and more compact GVP-HMM trajectory modelling of model space tied linear transformations. These improvements gave a total 16-fold speed-up in decoding time over conventional GVP-HMMs using a uniformly assigned polynomial degree. A significant error rate reduction of 15.6% relative was obtained over the baseline tandem HMM system on the secondary microphone channel condition of the Aurora 4 task. Consistent improvements were also obtained on the other subsets.
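In a GVP-HMM, each Gaussian component's parameters (e.g. its mean) vary as a polynomial function of an auxiliary conditioning variable, here a DNN bottleneck feature, so every frame requires a polynomial evaluation per component; lowering the polynomial degree directly cuts this per-frame cost. A minimal sketch of that trajectory evaluation, with illustrative coefficients and feature values that are not taken from the paper:

```python
def gvp_mean(coeffs, v):
    """Evaluate a polynomial mean trajectory mu(v) = sum_k c_k * v**k
    using Horner's rule; coeffs are ordered from degree 0 upward."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * v + c
    return result

# Degree-3 trajectory for one mean dimension (hypothetical coefficients).
coeffs = [0.5, -1.2, 0.3, 0.05]
v = 0.8  # one bottleneck feature value for the current frame

mu_full = gvp_mean(coeffs, v)       # full-degree evaluation
mu_low = gvp_mean(coeffs[:2], v)    # degree-1 model after degree reduction
```

In practice this evaluation is repeated for every Gaussian component and every mean dimension at each frame, which is why degree reduction and feature clustering (evaluating trajectories once per cluster rather than once per frame) yield the decoding speed-ups reported in the abstract.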


Bibliographic reference. Su, Rongfeng / Xie, Xurong / Liu, Xunying / Wang, Lan (2015): "Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition", in INTERSPEECH-2015, 2474-2478.