Deep Learning for Orca Call Type Identification — A Fully Unsupervised Approach

Christian Bergler, Manuel Schmitt, Rachael Xi Cheng, Andreas Maier, Volker Barth, Elmar Nöth


Call type classification is an important instrument in bioacoustic research investigating group-specific vocal repertoires, behavioral patterns, and cultures of different animal groups. There is a growing need for robust machine-based techniques to replace human classification, owing to their advantages in handling large datasets, delivering consistent results, removing perception-based classification, and minimizing human errors. The current work is the first to adopt a two-stage, fully unsupervised approach on previously machine-segmented orca data to identify orca sound types, using deep learning together with one of the largest bioacoustic datasets, the Orchive. The proposed methods include: (1) unsupervised feature learning using an undercomplete ResNet18-autoencoder trained on machine-annotated data, and (2) spectral clustering utilizing compressed orca feature representations. An existing human-labeled orca dataset, comprising 514 signals distributed over 12 classes, was clustered. This two-stage, fully unsupervised approach is an initial study to (1) examine machine-generated clusters against human-identified orca call type classes, (2) compare supervised call type classification with unsupervised call type clustering, and (3) verify the general feasibility of a completely unsupervised approach based on machine-labeled orca data. Such an approach could bring major progress to the research field of animal linguistics by enabling a much deeper understanding and opening up new insights and opportunities.
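
As a rough illustration of the second stage named in the abstract (spectral clustering on compressed feature representations), the following minimal sketch clusters a set of feature vectors with scikit-learn. It is not the authors' implementation: the function name `cluster_embeddings`, the RBF affinity, the `gamma` value, the 16-dimensional features, and the synthetic data standing in for autoencoder bottleneck outputs are all assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import SpectralClustering


def cluster_embeddings(features, n_clusters, gamma=0.01, seed=0):
    """Assign a cluster label to each row of `features`.

    features: array of shape (n_signals, n_dims). In the setting of the
    abstract these would be compressed representations from an
    autoencoder; here they are synthetic (assumption).
    gamma: RBF kernel width, chosen by hand for the synthetic data below.
    """
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="rbf",          # pairwise similarity exp(-gamma * ||x - y||^2)
        gamma=gamma,
        assign_labels="kmeans",  # k-means on the spectral embedding
        random_state=seed,
    )
    return model.fit_predict(features)


# Synthetic stand-in for compressed call representations:
# three well-separated groups of 40 vectors each in 16 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 16))
features = np.vstack([c + rng.normal(size=(40, 16)) for c in centers])

labels = cluster_embeddings(features, n_clusters=3)
print(labels.shape, sorted(set(labels.tolist())))
```

In a setting like the one described above, `features` would hold one row per signal and `n_clusters` would be chosen by the analyst, for instance to match the number of human-identified classes when comparing clusters against labels.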


DOI: 10.21437/Interspeech.2019-1857

Cite as: Bergler, C., Schmitt, M., Cheng, R.X., Maier, A., Barth, V., Nöth, E. (2019) Deep Learning for Orca Call Type Identification — A Fully Unsupervised Approach. Proc. Interspeech 2019, 3357-3361, DOI: 10.21437/Interspeech.2019-1857.


@inproceedings{Bergler2019,
  author={Christian Bergler and Manuel Schmitt and Rachael Xi Cheng and Andreas Maier and Volker Barth and Elmar Nöth},
  title={{Deep Learning for Orca Call Type Identification — A Fully Unsupervised Approach}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3357--3361},
  doi={10.21437/Interspeech.2019-1857},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1857}
}