Artificial Bandwidth Extension with Memory Inclusion Using Semi-supervised Stacked Auto-encoders

Pramod Bachhav, Massimiliano Todisco, Nicholas Evans


Artificial bandwidth extension (ABE) algorithms have been developed to improve speech quality when wideband devices receive speech signals from narrowband devices or infrastructure. The use of contextual information, in the form of dynamic features or explicit memory captured from neighbouring frames, is common in ABE research; however, these additional cues increase complexity and can introduce latency. Previous work has shown that unsupervised, linear dimensionality reduction techniques help to reduce complexity. This paper reports a semi-supervised, non-linear approach to dimensionality reduction using a stacked auto-encoder. In further contrast to previous work, it operates on raw spectra, from which a low-dimensional narrowband representation is learned in a data-driven manner. Three different objective speech quality measures show that the new features can be used with a standard regression model to improve ABE performance. Improvements in the mutual information between the learned features and the missing higher-frequency components are also observed, and the improvements in speech quality are corroborated by informal listening tests.
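The core idea of the abstract, learning a low-dimensional representation of narrowband spectra with a stacked auto-encoder, can be illustrated with a minimal numpy sketch. This is not the paper's implementation: all dimensions, learning rates, and data are illustrative placeholders, the layers here are trained greedily with a purely unsupervised reconstruction loss, and the paper's semi-supervised variant would additionally couple the bottleneck code to the missing highband targets during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class AELayer:
    """One auto-encoder layer: sigmoid encoder, linear decoder."""

    def __init__(self, d_in, d_hid, lr=0.1):
        self.W1 = rng.standard_normal((d_in, d_hid)) * 0.1
        self.b1 = np.zeros(d_hid)
        self.W2 = rng.standard_normal((d_hid, d_in)) * 0.1
        self.b2 = np.zeros(d_in)
        self.lr = lr

    def encode(self, x):
        return sigmoid(x @ self.W1 + self.b1)

    def train_step(self, x):
        """One full-batch gradient step on the reconstruction error."""
        n = len(x)
        h = self.encode(x)                       # hidden code
        r = h @ self.W2 + self.b2                # linear reconstruction
        e = r - x                                # reconstruction error
        # Backpropagate 0.5 * mean squared error through both layers
        dh = (e @ self.W2.T) * h * (1.0 - h)
        self.W2 -= self.lr * (h.T @ e) / n
        self.b2 -= self.lr * e.mean(axis=0)
        self.W1 -= self.lr * (x.T @ dh) / n
        self.b1 -= self.lr * dh.mean(axis=0)
        return 0.5 * np.mean(e ** 2)

# Toy stand-ins for narrowband log-spectra; sizes are illustrative only.
X = rng.standard_normal((256, 64))

# Greedy layer-wise stacking: 64 -> 32 -> 10. Each layer is trained to
# reconstruct its own input, then its codes feed the next layer.
layers, inp = [], X
for d_hid in (32, 10):
    ae = AELayer(inp.shape[1], d_hid)
    losses = [ae.train_step(inp) for _ in range(200)]
    layers.append(ae)
    inp = ae.encode(inp)

codes = inp  # the learned 10-dimensional narrowband representation
```

In a semi-supervised setting along the lines the abstract describes, a regression head predicting highband features from `codes` would be trained jointly, so the bottleneck retains information relevant to the missing higher frequencies rather than only reconstruction.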


DOI: 10.21437/Interspeech.2018-2213

Cite as: Bachhav, P., Todisco, M., Evans, N. (2018) Artificial Bandwidth Extension with Memory Inclusion Using Semi-supervised Stacked Auto-encoders. Proc. Interspeech 2018, 1185-1189, DOI: 10.21437/Interspeech.2018-2213.


@inproceedings{Bachhav2018,
  author={Pramod Bachhav and Massimiliano Todisco and Nicholas Evans},
  title={Artificial Bandwidth Extension with Memory Inclusion Using Semi-supervised Stacked Auto-encoders},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={1185--1189},
  doi={10.21437/Interspeech.2018-2213},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2213}
}