ISCA Archive Interspeech 2021

A Deep and Recurrent Architecture for Primate Vocalization Classification

Robert Müller, Steffen Illium, Claudia Linnhoff-Popien

Wildlife monitoring is an essential part of most conservation efforts, and acoustic monitoring is one of its many building blocks. Acoustic monitoring has the advantage of being non-invasive and applicable in areas of high vegetation. In this work, we present a deep and recurrent architecture for the classification of primate vocalizations that is based upon well-proven modules such as bidirectional Long Short-Term Memory neural networks, pooling, normalized softmax and focal loss. Additionally, we apply Bayesian optimization to obtain a suitable set of hyperparameters. We test our approach on a recently published dataset of primate vocalizations that were recorded in an African wildlife sanctuary. Using an ensemble of the best five models found during hyperparameter optimization on the development set, we achieve an Unweighted Average Recall (UAR) of 89.3% on the test set. Our approach outperforms the best baseline, an ensemble of various deep and shallow classifiers, which achieves a UAR of 87.5%.
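For readers unfamiliar with the metric, Unweighted Average Recall is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name and the labels "a" and "b" are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR, also called macro recall).

    Illustrative helper only; the name and signature are assumptions,
    not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall of each class that occurs in y_true, averaged without weighting.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels: four samples of "a", two of "b".
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "b"]
uar = unweighted_average_recall(y_true, y_pred)
```

In this toy example the recall is 3/4 for class "a" and 2/2 for class "b", giving a UAR of 0.875, whereas plain accuracy would be 5/6. The gap shows why UAR is preferred when class sizes differ.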

doi: 10.21437/Interspeech.2021-1274

Cite as: Müller, R., Illium, S., Linnhoff-Popien, C. (2021) A Deep and Recurrent Architecture for Primate Vocalization Classification. Proc. Interspeech 2021, 461-465, doi: 10.21437/Interspeech.2021-1274

  author={Robert Müller and Steffen Illium and Claudia Linnhoff-Popien},
  title={{A Deep and Recurrent Architecture for Primate Vocalization Classification}},
  booktitle={Proc. Interspeech 2021},