All-Conv Net for Bird Activity Detection: Significance of Learned Pooling

Arjun Pankajakshan, Anshul Thakur, Daksh Thapar, Padmanabhan Rajan, Aditya Nigam

Bird activity detection (BAD) is the task of predicting the presence or absence of bird vocalizations in a given audio recording. In this paper, we propose an all-convolutional neural network (all-conv net) for bird activity detection. All layers of this network, including the pooling and dense layers, are implemented using convolution operations. Pooling implemented by convolution is termed learned pooling. Learned pooling takes into account the inter-feature-map correlations that are ignored in traditional max-pooling. This helps in learning a pooling function that aggregates the complementary information in various feature maps, leading to better bird activity detection. Experimental observations confirm this hypothesis. The performance of the proposed all-conv net is evaluated on the BAD Challenge 2017 dataset. The proposed all-conv net achieves state-of-the-art performance with a simple architecture and does not employ any data pre-processing or data augmentation techniques.
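
The contrast between max-pooling and learned pooling can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the 2x2 window, the stride of 2, the function names `max_pool` and `conv_pool`, and the hand-set weights are all assumptions made for illustration. In a real network the convolution weights would be learned during training. The point shown is only that max-pooling treats each feature map independently, whereas a strided convolution sums over all input feature maps and can therefore combine information across them.

```python
def max_pool(x, k=2):
    """Max-pool each feature map independently; x is indexed [C][H][W]."""
    return [
        [
            [max(fm[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(0, len(fm[0]) - k + 1, k)]
            for i in range(0, len(fm) - k + 1, k)
        ]
        for fm in x
    ]


def conv_pool(x, w, k=2):
    """Strided convolution used as pooling (stride = k, an assumption).

    w is indexed [C_out][C_in][k][k]. Each output value sums over ALL
    input feature maps, so the pooling can mix information across maps.
    """
    height, width = len(x[0]), len(x[0][0])
    return [
        [
            [sum(w_out[c][di][dj] * x[c][i + di][j + dj]
                 for c in range(len(x))
                 for di in range(k)
                 for dj in range(k))
             for j in range(0, width - k + 1, k)]
            for i in range(0, height - k + 1, k)
        ]
        for w_out in w
    ]


# Two toy 2x2 feature maps (hypothetical values).
x = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
]

# One output map whose kernel averages within each input map and then
# adds the two results; hand-set here, learned in an actual network.
w = [[
    [[0.25, 0.25], [0.25, 0.25]],
    [[0.25, 0.25], [0.25, 0.25]],
]]

pooled_max = max_pool(x)         # one output value per input map
pooled_learned = conv_pool(x, w)  # one value combining both maps
```

Here `max_pool` returns one value per feature map with no interaction between maps, while `conv_pool` produces a single value that depends on both maps through the kernel weights.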

DOI: 10.21437/Interspeech.2018-1522

Cite as: Pankajakshan, A., Thakur, A., Thapar, D., Rajan, P., Nigam, A. (2018) All-Conv Net for Bird Activity Detection: Significance of Learned Pooling. Proc. Interspeech 2018, 2122-2126, DOI: 10.21437/Interspeech.2018-1522.

  author={Arjun Pankajakshan and Anshul Thakur and Daksh Thapar and Padmanabhan Rajan and Aditya Nigam},
  title={All-Conv Net for Bird Activity Detection: Significance of Learned Pooling},
  booktitle={Proc. Interspeech 2018},