15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Deep Neural Network Bottleneck Features for Generalized Variable Parameter HMMs

Xurong Xie, Rongfeng Su, Xunying Liu, Lan Wang

Chinese Academy of Sciences, China

Recently, deep neural networks (DNNs) have become increasingly popular for acoustic modelling in automatic speech recognition (ASR) systems. As the bottleneck features they produce are inherently discriminative and contain rich hidden factors that influence the surface acoustic realization, the standard approach is to augment the conventional acoustic features with the bottleneck features in a tandem framework. In this paper, an alternative approach to incorporating bottleneck features is investigated. The complex relationship between acoustic features and DNN bottleneck features is modelled using generalized variable parameter HMMs (GVP-HMMs). The optimal GVP-HMM structural configuration and model parameters are automatically learnt. Significant relative error rate reductions of 48% and 8% were obtained on Aurora 2 over the baseline multi-style HMM and tandem HMM systems, respectively.
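
For readers unfamiliar with the tandem framework that serves as the comparison system, the following is a minimal sketch of frame-level feature augmentation, assuming the acoustic and bottleneck features are already time-aligned. The function name `tandem_features`, the feature dimensions, and the toy values are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def tandem_features(
    acoustic: Sequence[Sequence[float]],
    bottleneck: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate acoustic and bottleneck vectors frame by frame.

    Both inputs hold one feature vector per frame and must cover the
    same number of frames (hypothetical helper, for illustration only).
    """
    if len(acoustic) != len(bottleneck):
        raise ValueError("acoustic and bottleneck streams differ in frame count")
    # Each tandem frame is the acoustic vector followed by the bottleneck vector.
    return [list(a) + list(b) for a, b in zip(acoustic, bottleneck)]


# Toy example: 2 frames, 3 acoustic dimensions, 2 bottleneck dimensions.
acoustic = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
bottleneck = [[1.0, 2.0], [3.0, 4.0]]
tandem = tandem_features(acoustic, bottleneck)
```

Each resulting frame has the combined dimensionality of the two streams. The approach investigated in the paper differs from this concatenation: it models the relationship between the two feature streams with GVP-HMMs rather than appending one to the other.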

Bibliographic reference. Xie, Xurong / Su, Rongfeng / Liu, Xunying / Wang, Lan (2014): "Deep neural network bottleneck features for generalized variable parameter HMMs", in INTERSPEECH-2014, 2739-2743.