KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement

Li Chai, Jun Du, Chin-Hui Lee


In this paper, we propose a Kullback-Leibler divergence (KLD) regularized approach to adapting a speaker-independent (SI) speech enhancement model, based on regression deep neural networks (DNNs), to a speaker-dependent (SD) model using only a tiny amount of speaker-specific adaptation data. The algorithm adapts the DNN model conservatively by forcing the conditional target distribution estimated by the SD model to stay close to that of the SI model. This constraint is realized by adding a KLD regularization term to our previously proposed maximum likelihood objective function. Experimental results demonstrate that, even with only 10 seconds of SD adaptation data, the proposed framework consistently achieves speech intelligibility improvements under all 15 unseen noise types evaluated and at all signal-to-noise ratio levels for all 8 test speakers from the WSJ0 evaluation set.
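To illustrate the idea of conservative adaptation described above, here is a minimal NumPy sketch of a KLD-regularized adaptation objective. It assumes a fixed-variance Gaussian form for the conditional target distribution, under which the KLD term between the SD and SI model outputs reduces to a mean squared difference; the interpolation weight `rho` and the function name are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kld_regularized_loss(y_sd, y_si, target, rho=0.5):
    """Sketch of a KLD-regularized adaptation objective.

    Under a fixed-variance Gaussian assumption on the conditional
    target distribution, the KL divergence between the SD and SI
    model outputs reduces to a mean squared difference, so the loss
    interpolates between fitting the adaptation data and keeping
    the adapted (SD) model close to the SI model.
    Note: `rho` is a hypothetical regularization weight for this sketch.
    """
    fit_term = np.mean((y_sd - target) ** 2)  # fit the SD adaptation data
    kld_term = np.mean((y_sd - y_si) ** 2)    # stay close to the SI model
    return (1.0 - rho) * fit_term + rho * kld_term

# Toy usage: rho = 0 recovers the plain regression loss on the
# adaptation data; rho = 1 pins the SD model to the SI outputs.
y_si = np.array([0.2, 0.4])      # SI model predictions (toy values)
y_sd = np.array([0.3, 0.5])      # SD model predictions (toy values)
target = np.array([0.35, 0.55])  # clean-speech targets (toy values)
print(kld_regularized_loss(y_sd, y_si, target, rho=0.5))
```

Setting `rho` between 0 and 1 trades off fitting the small adaptation set against drifting from the well-trained SI model, which is the conservative-adaptation behavior the abstract describes.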


DOI: 10.21437/Interspeech.2019-2426

Cite as: Chai, L., Du, J., Lee, C. (2019) KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement. Proc. Interspeech 2019, 1806-1810, DOI: 10.21437/Interspeech.2019-2426.


@inproceedings{Chai2019,
  author={Li Chai and Jun Du and Chin-Hui Lee},
  title={{KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1806--1810},
  doi={10.21437/Interspeech.2019-2426},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2426}
}