Practical Applicability of Deep Neural Networks for Overlapping Speaker Separation

Pieter Appeltans, Jeroen Zegers, Hugo Van hamme


This paper examines the applicability in realistic scenarios of two deep learning based solutions to the overlapping speaker separation problem. First, we present experiments showing that these methods are applicable to a broad range of languages. Further experiments indicate limited performance loss for untrained languages when these share features with the trained language(s). Second, we investigate how the methods deal with realistic background noise and propose some modifications to better cope with these disturbances. The deep learning methods examined are deep clustering and deep attractor networks.


DOI: 10.21437/Interspeech.2019-1807

Cite as: Appeltans, P., Zegers, J., Van hamme, H. (2019) Practical Applicability of Deep Neural Networks for Overlapping Speaker Separation. Proc. Interspeech 2019, 1353-1357, DOI: 10.21437/Interspeech.2019-1807.


@inproceedings{Appeltans2019,
  author={Pieter Appeltans and Jeroen Zegers and Hugo Van hamme},
  title={{Practical Applicability of Deep Neural Networks for Overlapping Speaker Separation}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1353--1357},
  doi={10.21437/Interspeech.2019-1807},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1807}
}