Deep Neural Network (DNN) acoustic models have yielded many state-of-the-art results in Automatic Speech Recognition (ASR) tasks. More recently, Recurrent Neural Network (RNN) models have been shown to outperform their DNN counterparts. However, state-of-the-art DNN and RNN models tend to be impractical to deploy on embedded systems with limited computational capacity. Traditionally, the approach for embedded platforms is either to train a small DNN directly or to train a small DNN that learns the output distribution of a large DNN. In this paper, we use a state-of-the-art RNN to transfer knowledge to a small DNN. We use the RNN model to generate soft alignments and train the small DNN by minimizing the Kullback-Leibler divergence between those alignments and the DNN's output distribution. The small DNN trained on the soft RNN alignments achieved a 3.9 WER on the Wall Street Journal (WSJ) eval92 task, compared to a baseline WER of 4.6, a relative improvement of more than 13%.
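The training objective described above can be illustrated with a minimal sketch: for each frame, the teacher RNN supplies a posterior distribution over acoustic states (the soft alignment), and the loss is the KL divergence from that distribution to the small DNN's softmax output, averaged over frames. The function names, the three-state toy posteriors, and the logits below are assumptions made for illustration; they are not taken from the paper, which does not specify these details in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def distillation_loss(teacher_posteriors, student_logits):
    """Mean per-frame KL(teacher || student).

    teacher_posteriors: per-frame soft alignments from the teacher (hypothetical input).
    student_logits: per-frame pre-softmax outputs of the small DNN (hypothetical input).
    """
    total = 0.0
    for p, logits in zip(teacher_posteriors, student_logits):
        total += kl_divergence(p, softmax(logits))
    return total / len(teacher_posteriors)


# Toy example: two frames, three acoustic states (values are made up).
teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
student = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
loss = distillation_loss(teacher, student)
print(f"mean KL divergence: {loss:.4f}")
```

Because the teacher posteriors are fixed during student training, minimizing this KL divergence is equivalent to minimizing the cross-entropy between the soft alignments and the student's outputs; the two differ only by the teacher's entropy, which does not depend on the student's parameters.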
Bibliographic reference. Chan, William / Ke, Nan Rosemary / Lane, Ian (2015): "Transferring knowledge from a RNN to a DNN", In INTERSPEECH-2015, 3264-3268.