Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition

Liang Lu, Steve Renals


For speech recognition, deep neural networks (DNNs) have significantly improved recognition accuracy on most benchmark datasets and application domains. However, compared to conventional Gaussian mixture models, DNN-based acoustic models usually have a much larger number of model parameters, making them challenging to deploy on resource-constrained platforms, e.g., mobile devices. In this paper, we study the application of the recently proposed highway network to train small-footprint DNNs, which are thinner and deeper, and have a significantly smaller number of model parameters compared to conventional DNNs. We investigated this approach on the AMI meeting speech transcription corpus, which has around 80 hours of audio data. The highway neural networks consistently outperformed their plain DNN counterparts, and the number of model parameters can be reduced significantly without sacrificing recognition accuracy.


DOI: 10.21437/Interspeech.2016-39

Cite as

Lu, L., Renals, S. (2016) Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition. Proc. Interspeech 2016, 12-16.

BibTeX
@inproceedings{Lu+2016,
author={Liang Lu and Steve Renals},
title={Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-39},
url={http://dx.doi.org/10.21437/Interspeech.2016-39},
pages={12--16}
}