Robust Acoustic Event Classification Using Bag-of-Visual-Words

Manjunath Mulimani, Shashidhar G Koolagudi


This paper presents a novel Bag-of-Visual-Words (BoVW) approach to represent the grayscale spectrograms of acoustic events. These BoVW representations, referred to as histograms of visual features, are used for Acoustic Event Classification (AEC). Further, the Chi-square distance between histograms of visual features is evaluated and used to generate the kernel of a Support Vector Machine classifier (Chi-square SVM). The proposed histograms of visual features, together with the Chi-square SVM classifier, are evaluated on different categories of acoustic events from the UPC-TALP corpora in clean and different noise conditions. Results show that the proposed approach is more robust to noise and achieves improved recognition accuracy compared to other methods.
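
As a rough illustration of how a Chi-square distance between histograms can be turned into an SVM kernel, the sketch below computes a Gram matrix over a few toy histograms. The abstract does not give the exact formulation, so the exponential form exp(-gamma * chi2), the absence of a 1/2 scaling factor in the distance, the `gamma` parameter, and the function names `chi_square_distance` and `chi_square_kernel` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def chi_square_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two non-negative histograms.

    Assumed form: sum_k (h1[k] - h2[k])^2 / (h1[k] + h2[k]).
    Some definitions include an extra factor of 1/2; it is omitted here.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h1, h2):
        denom = a + b
        if denom > eps:  # bins that are empty in both histograms contribute 0
            total += (a - b) ** 2 / denom
    return total


def chi_square_kernel(histograms, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * chi2(h_i, h_j)) (assumed kernel form)."""
    n = len(histograms)
    return [
        [
            math.exp(-gamma * chi_square_distance(histograms[i], histograms[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy normalized histograms of visual words (hypothetical data, 3 bins each).
hists = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
K = chi_square_kernel(hists, gamma=1.0)
```

A Gram matrix like `K` could then be handed to any SVM implementation that accepts a precomputed kernel; identical histograms give a kernel value of 1, and the value decays toward 0 as the histograms diverge.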


DOI: 10.21437/Interspeech.2018-1905

Cite as: Mulimani, M., Koolagudi, S.G. (2018) Robust Acoustic Event Classification Using Bag-of-Visual-Words. Proc. Interspeech 2018, 3319-3322, DOI: 10.21437/Interspeech.2018-1905.


@inproceedings{Mulimani2018,
  author={Manjunath Mulimani and Shashidhar G Koolagudi},
  title={Robust Acoustic Event Classification Using Bag-of-Visual-Words},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3319--3322},
  doi={10.21437/Interspeech.2018-1905},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1905}
}