A Simple Model for Detection of Rare Sound Events

Weiran Wang, Chieh-Chi Kao, Chao Wang


We propose a simple recurrent model for detecting rare sound events, when the time boundaries of events are available for training. Our model optimizes the combination of an utterance-level loss, which classifies whether an event occurs in an utterance, and a frame-level loss, which classifies whether each frame corresponds to the event when it does occur. The two losses make use of a shared vectorial representation of the event and are connected by an attention mechanism. We demonstrate our model on Task 2 of the DCASE 2017 challenge and achieve competitive performance.
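The combined objective described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the attention form, the sigmoid cross-entropy losses, and the mixing weight `alpha` are all assumptions made for illustration, and the recurrent encoder that would produce the per-frame scores is omitted.

```python
import numpy as np

def combined_loss(frame_logits, frame_labels, utt_label, alpha=0.5):
    """Sketch of a combined utterance- and frame-level loss.

    frame_logits: (T,) per-frame event scores (assumed to come from a
                  recurrent encoder, not shown here)
    frame_labels: (T,) 0/1 per-frame targets from the event time boundaries
    utt_label:    0 or 1, whether the event occurs anywhere in the utterance
    alpha:        hypothetical weight mixing the two losses
    """
    # Attention weights over frames (softmax of the frame scores),
    # connecting the frame-level scores to the utterance-level decision.
    att = np.exp(frame_logits - frame_logits.max())
    att /= att.sum()

    # Utterance-level score: attention-pooled frame scores.
    utt_logit = float(np.dot(att, frame_logits))
    p_utt = 1.0 / (1.0 + np.exp(-utt_logit))
    utt_loss = -(utt_label * np.log(p_utt)
                 + (1 - utt_label) * np.log(1.0 - p_utt))

    # Frame-level cross-entropy, applied only when the event occurs,
    # mirroring "classifies each frame ... when it does occur".
    frame_loss = 0.0
    if utt_label == 1:
        p_frame = 1.0 / (1.0 + np.exp(-frame_logits))
        frame_loss = -np.mean(frame_labels * np.log(p_frame)
                              + (1 - frame_labels) * np.log(1.0 - p_frame))

    return alpha * utt_loss + (1 - alpha) * frame_loss
```

A well-scored utterance (high logits on event frames, low elsewhere) yields a lower combined loss than one with the frame scores inverted, since the frame-level term penalizes the mismatch.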


DOI: 10.21437/Interspeech.2018-2338

Cite as: Wang, W., Kao, C.-C., Wang, C. (2018) A Simple Model for Detection of Rare Sound Events. Proc. Interspeech 2018, 1344-1348, DOI: 10.21437/Interspeech.2018-2338.


@inproceedings{Wang2018,
  author={Weiran Wang and Chieh-Chi Kao and Chao Wang},
  title={A Simple Model for Detection of Rare Sound Events},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1344--1348},
  doi={10.21437/Interspeech.2018-2338},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2338}
}