Frame-Wise Dynamic Threshold Based Polyphonic Acoustic Event Detection

Xianjun Xia, Roberto Togneri, Ferdous Sohel, David Huang


Acoustic event detection, the determination of the acoustic event type and the localisation of the event, is used in many real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection, using a global threshold to decide which acoustic events are active. However, the global threshold has to be set manually and is highly dependent on the database being tested. To address this, we replace the fixed threshold with a frame-wise dynamic threshold. Two novel approaches are proposed: a contour-based and a regressor-based dynamic threshold. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
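
The difference between a global and a frame-wise threshold can be illustrated with a minimal Python sketch. It assumes the classifier outputs one score per event class per frame. The function names and the rule in `toy_thresholds` (a fraction of the frame's highest score, with a floor) are hypothetical and chosen only for illustration; they are not the contour or regressor based approaches of the paper, which the abstract does not specify.

```python
from typing import List

Scores = List[List[float]]   # scores[frame][event_class], each in [0, 1]


def detect_global(scores: Scores, threshold: float = 0.5) -> List[List[bool]]:
    """Mark an event active when its score reaches one fixed threshold."""
    return [[p >= threshold for p in frame] for frame in scores]


def detect_framewise(scores: Scores, thresholds: List[float]) -> List[List[bool]]:
    """Mark an event active when its score reaches that frame's own threshold."""
    if len(scores) != len(thresholds):
        raise ValueError("one threshold is required per frame")
    return [[p >= t for p in frame] for frame, t in zip(scores, thresholds)]


def toy_thresholds(scores: Scores, ratio: float = 0.6, floor: float = 0.1) -> List[float]:
    """Hypothetical per-frame rule (NOT the paper's method):
    a fraction of the frame's highest score, never below `floor`."""
    return [max(floor, ratio * max(frame)) for frame in scores]


# Toy scores: 3 frames, 3 event classes.
scores = [
    [0.90, 0.20, 0.10],   # one clearly active event
    [0.40, 0.35, 0.05],   # two overlapping events with moderate scores
    [0.05, 0.04, 0.02],   # no active event
]

global_decisions = detect_global(scores, threshold=0.5)
framewise_decisions = detect_framewise(scores, toy_thresholds(scores))

print(global_decisions)
print(framewise_decisions)
```

With these toy scores, the fixed threshold of 0.5 misses both overlapping events in the second frame, whereas the per-frame threshold adapts to the lower score level of that frame. The floor keeps the silent third frame from producing detections.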


DOI: 10.21437/Interspeech.2017-746

Cite as: Xia, X., Togneri, R., Sohel, F., Huang, D. (2017) Frame-Wise Dynamic Threshold Based Polyphonic Acoustic Event Detection. Proc. Interspeech 2017, 474-478, DOI: 10.21437/Interspeech.2017-746.


@inproceedings{Xia2017,
  author={Xianjun Xia and Roberto Togneri and Ferdous Sohel and David Huang},
  title={Frame-Wise Dynamic Threshold Based Polyphonic Acoustic Event Detection},
  year={2017},
  booktitle={Proc. Interspeech 2017},
  pages={474--478},
  doi={10.21437/Interspeech.2017-746},
  url={http://dx.doi.org/10.21437/Interspeech.2017-746}
}