SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement

Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee


In this paper, we propose a novel progressive learning (PL) framework for deep neural network (DNN) based speech enhancement. It decomposes the complicated regression problem of mapping noisy speech to clean speech into a series of subproblems, with the aim of improving system performance and reducing model complexity. As an illustration, we design a signal-to-noise ratio (SNR) based PL architecture in which each hidden layer of the DNN is explicitly guided to learn an intermediate target with a gradual SNR gain. The rich set of information from the multiple learning targets can also be exploited in post-processing. Experimental results demonstrate that SNR-based progressive learning effectively improves perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores in low-SNR environments, and reduces the number of model parameters by 50% compared with the DNN baseline system. Combining the proposed approach with post-processing yields further improvement.
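The idea of intermediate targets with gradual SNR gains can be illustrated with a minimal sketch. The abstract does not say how the targets are built, so everything below is an assumption made for illustration: the time-domain mixing, the 10 dB gain per stage, the number of intermediate stages, and the helper names mix_at_snr, measured_snr_db, and progressive_targets are all hypothetical and are not taken from the paper. The sketch only shows how the same clean utterance and noise segment could be remixed at successively higher SNRs, ending with the clean signal as the final target.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR (dB)."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR = 10*log10(clean_power / (scale^2 * noise_power)), solved for scale.
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


def measured_snr_db(clean, mixture):
    """SNR of `mixture` with respect to the known clean signal."""
    residual = mixture - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


def progressive_targets(clean, noise, input_snr_db, gain_db=10.0, num_intermediate=2):
    """Build hypothetical learning targets with a fixed SNR gain per stage.

    Assumption: each intermediate target is the same utterance mixed with the
    same noise at input_snr_db + k * gain_db; the last target is clean speech.
    """
    targets = [
        mix_at_snr(clean, noise, input_snr_db + k * gain_db)
        for k in range(1, num_intermediate + 1)
    ]
    targets.append(clean.copy())
    return targets


# Toy signals: a 440 Hz tone stands in for speech, white noise for the interference.
rng = np.random.default_rng(0)
clean = np.sin(2.0 * np.pi * 440.0 * np.arange(16000) / 16000.0)
noise = rng.standard_normal(16000)

noisy_input = mix_at_snr(clean, noise, 0.0)
targets = progressive_targets(clean, noise, input_snr_db=0.0)
stage_snrs = [measured_snr_db(clean, t) for t in targets[:-1]]
print("intermediate target SNRs (dB):", np.round(stage_snrs, 2))
```

In a setup like this, each stage's target would supervise one intermediate layer of the network and the clean signal would supervise the output layer. How the per-layer losses are combined, and which acoustic features are used, is left open here because the abstract does not specify it.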


DOI: 10.21437/Interspeech.2016-224

Cite as

Gao, T., Du, J., Dai, L.-R., Lee, C.-H. (2016) SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement. Proc. Interspeech 2016, 3713-3717.

BibTeX
@inproceedings{Gao+2016,
author={Tian Gao and Jun Du and Li-Rong Dai and Chin-Hui Lee},
title={SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-224},
url={http://dx.doi.org/10.21437/Interspeech.2016-224},
pages={3713--3717}
}