Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection

Ke-Xin He, Yu-Han Shen, Wei-Qiang Zhang


Sound event detection with weakly labeled data is commonly treated as a multi-instance learning problem, and the choice of pooling function is key to solving it. In this paper, we propose a hierarchical pooling structure to improve the performance of weakly labeled sound event detection systems. The proposed pooling structure yields remarkable improvements for three types of pooling function without adding any parameters. Moreover, our system achieves competitive performance on Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 Challenge using the hierarchical pooling structure.
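
To illustrate the role a pooling function plays in the multi-instance setting, the following is a minimal sketch and not the authors' implementation. It assumes frame-level probabilities for a single event class and shows three pooling functions that are common in this literature (max, average, and linear softmax); the abstract does not name the three functions studied, so this choice is an assumption. The helper `two_stage_pool` is a hypothetical reading of "hierarchical": it pools within fixed-size groups of frames and then pools the group results, which adds no trainable parameters. The names `two_stage_pool` and `group_size` are illustrative only.

```python
from typing import Callable, List

# A pooling function maps frame-level probabilities to one clip-level probability.
Pool = Callable[[List[float]], float]


def max_pool(probs: List[float]) -> float:
    """Clip probability is the largest frame probability."""
    return max(probs)


def avg_pool(probs: List[float]) -> float:
    """Clip probability is the mean of the frame probabilities."""
    return sum(probs) / len(probs)


def linear_softmax_pool(probs: List[float]) -> float:
    """Weighted mean in which each frame is weighted by its own probability."""
    total = sum(probs)
    if total == 0:
        return 0.0
    return sum(p * p for p in probs) / total


def two_stage_pool(probs: List[float], pool: Pool, group_size: int) -> float:
    """Hypothetical two-level pooling (an assumption, not the paper's structure):
    pool inside consecutive groups of frames, then pool the group outputs."""
    groups = [probs[i:i + group_size] for i in range(0, len(probs), group_size)]
    return pool([pool(group) for group in groups])


if __name__ == "__main__":
    frame_probs = [0.1, 0.2, 0.9, 0.8]  # made-up frame probabilities
    for name, fn in [("max", max_pool), ("avg", avg_pool),
                     ("linear softmax", linear_softmax_pool)]:
        flat = fn(frame_probs)
        nested = two_stage_pool(frame_probs, fn, group_size=2)
        print(f"{name}: flat={flat:.4f} two-stage={nested:.4f}")
```

In this toy example, max and average pooling give the same result whether applied once or in two stages with equal-sized groups, while linear softmax does not, so any benefit of a hierarchical arrangement depends on the pooling function and on how it interacts with training.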


DOI: 10.21437/Interspeech.2019-2049

Cite as: He, K., Shen, Y., Zhang, W. (2019) Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection. Proc. Interspeech 2019, 3624-3628, DOI: 10.21437/Interspeech.2019-2049.


@inproceedings{He2019,
  author={Ke-Xin He and Yu-Han Shen and Wei-Qiang Zhang},
  title={{Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3624--3628},
  doi={10.21437/Interspeech.2019-2049},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2049}
}