Speech Enhancement with Variance Constrained Autoencoders

D.T. Braithwaite, W. Bastiaan Kleijn


Recent machine-learning-based approaches to speech enhancement operate in the time domain and have been shown to outperform classical enhancement methods. Two such models are SE-GAN and SE-WaveNet, both of which rely on complex neural network architectures that make them expensive to train. We propose using the Variance Constrained Autoencoder (VCAE) for speech enhancement. Our model uses a simpler neural network structure than competing solutions and is a natural fit for the speech enhancement task. We demonstrate experimentally that the proposed enhancement model outperforms SE-GAN and SE-WaveNet in terms of the perceptual quality of the enhanced signals.
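
As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a variance-constrained autoencoder for time-domain enhancement. The layer configuration, the form of the variance penalty, and all hyperparameters here are assumptions made for illustration; they are not the architecture or training objective reported in the paper.

# Hypothetical sketch of a variance-constrained autoencoder (VCAE) for
# time-domain speech enhancement. Layer sizes, the variance penalty, and the
# loss weighting are illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn


class VCAE(nn.Module):
    def __init__(self, latent_channels: int = 64):
        super().__init__()
        # Encoder: strided 1-D convolutions over the raw noisy waveform.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=31, stride=2, padding=15),
            nn.PReLU(),
            nn.Conv1d(32, latent_channels, kernel_size=31, stride=2, padding=15),
        )
        # Decoder: transposed convolutions back to waveform resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent_channels, 32, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
            nn.ConvTranspose1d(32, 1, kernel_size=32, stride=2, padding=15),
            nn.Tanh(),
        )

    def forward(self, noisy: torch.Tensor):
        z = self.encoder(noisy)        # latent code
        enhanced = self.decoder(z)     # estimate of the clean waveform
        return enhanced, z


def vcae_loss(enhanced, clean, z, target_var: float = 1.0, weight: float = 0.1):
    """Reconstruction loss plus a penalty keeping the latent variance near a target."""
    recon = torch.mean((enhanced - clean) ** 2)
    var_penalty = (z.var() - target_var) ** 2
    return recon + weight * var_penalty


if __name__ == "__main__":
    model = VCAE()
    noisy = torch.randn(4, 1, 16384)   # batch of roughly 1-second 16 kHz excerpts
    clean = torch.randn(4, 1, 16384)
    enhanced, z = model(noisy)
    loss = vcae_loss(enhanced, clean, z)
    loss.backward()
    print(enhanced.shape, loss.item())

In this sketch the encoder maps the noisy waveform to a latent code, the decoder reconstructs an estimate of the clean waveform, and a soft constraint keeps the latent variance near a fixed target; the exact constraint used by the VCAE should be taken from the paper itself.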


DOI: 10.21437/Interspeech.2019-1809

Cite as: Braithwaite, D., Kleijn, W.B. (2019) Speech Enhancement with Variance Constrained Autoencoders. Proc. Interspeech 2019, 1831-1835, DOI: 10.21437/Interspeech.2019-1809.


@inproceedings{Braithwaite2019,
  author={D.T. Braithwaite and W. Bastiaan Kleijn},
  title={{Speech Enhancement with Variance Constrained Autoencoders}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1831--1835},
  doi={10.21437/Interspeech.2019-1809},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1809}
}