Real-Time Speech Enhancement with GCC-NMF: Demonstration on the Raspberry Pi and NVIDIA Jetson

Sean U.N. Wood, Jean Rouat


We demonstrate a real-time, open source implementation of the online GCC-NMF stereo speech enhancement algorithm. While the system runs on a variety of operating systems and hardware platforms, we highlight its potential for real-world mobile use by presenting it on two embedded systems: the Raspberry Pi 3 and the NVIDIA Jetson TX1. The effect of various algorithm parameters on subjective enhancement quality may be explored interactively via a graphical user interface, with the results heard in real-time. The trade-off between interference suppression and target fidelity is controlled by manipulating the parameters of the coefficient masking function. Increasing the pre-learned dictionary size improves overall speech enhancement quality at increased computational cost. We show that real-time GCC-NMF has potential for real-world application, remaining purely unsupervised and retaining the simplicity and flexibility of offline GCC-NMF.


Cite as: Wood, S.U., Rouat, J. (2017) Real-Time Speech Enhancement with GCC-NMF: Demonstration on the Raspberry Pi and NVIDIA Jetson. Proc. Interspeech 2017, 2048-2049.


@inproceedings{Wood2017,
  author={Sean U.N. Wood and Jean Rouat},
  title={Real-Time Speech Enhancement with GCC-NMF: Demonstration on the Raspberry Pi and NVIDIA Jetson},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2048--2049}
}