This paper investigates a method for training bottleneck (BN) features in a manner more targeted to their intended use in GMM-HMM based ASR. Our approach adds a GMM acoustic model activation layer to a standard BN feature extraction (FE) neural network and performs lattice-based MMI training on the resulting network. After training, we remove the GMM activation layer to recover a working BN FE network, and we then train a GMM system on top of the bottleneck features in the usual way. Our results show that this approach can significantly improve recognition accuracy compared to a baseline system that uses standard BN features. Further, we show that our approach can be used to perform unsupervised speaker adaptation, yielding significantly improved results compared to global cMLLR adaptation.
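To make the idea of a GMM activation layer concrete, the following is a minimal sketch of how such a layer could map one bottleneck feature vector to per-state log-likelihoods. It rests on assumptions the abstract does not state: diagonal-covariance Gaussians, log-likelihood outputs, and one GMM per HMM state. All function names are hypothetical, and the lattice-based MMI training itself is not shown.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under one GMM, using a stable log-sum-exp."""
    terms = [
        math.log(w) + diag_gauss_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def gmm_activation_layer(bn_features, state_gmms):
    """Assumed layer form: one log-likelihood per HMM state.

    state_gmms is a list of (weights, means, variances) tuples, one per state.
    """
    return [gmm_log_likelihood(bn_features, *gmm) for gmm in state_gmms]

# Toy example: two states over a 2-dimensional bottleneck feature vector.
state_gmms = [
    ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    ([0.5, 0.5], [[1.0, 1.0], [-1.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]]),
]
scores = gmm_activation_layer([0.0, 0.0], state_gmms)
```

In this sketch, removing the layer after training simply means using `bn_features` directly as the input to a conventionally trained GMM system.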
Bibliographic reference. Paulik, Matthias (2013): "Lattice-based training of bottleneck feature extraction neural networks", In INTERSPEECH-2013, 89-93.