Improved Music Genre Classification with Convolutional Neural Networks

Weibin Zhang, Wenkang Lei, Xiangmin Xu, Xiaofeng Xing


In recent years, deep neural networks have been shown to be effective in many classification tasks, including music genre classification. In this paper, we propose two ways to improve music genre classification with convolutional neural networks: 1) combining max- and average-pooling to provide more statistical information to higher-level layers; 2) using shortcut connections to skip one or more layers, inspired by residual learning. The input to the CNN is simply the short-time Fourier transform of the audio signal. The output of the CNN is fed into another deep neural network for classification. Comparing two different network topologies, our preliminary experimental results on the GTZAN data set show that both methods can effectively improve the classification accuracy, especially the second one.
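The first idea, combining max- and average-pooling, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact layer configuration: it assumes a single 2D feature map and a non-overlapping 2x2 pooling window, and stacks the two pooled maps so that higher layers receive both statistics.

```python
import numpy as np

def combined_pool(feature_map, size=2):
    """Apply max- and average-pooling with a non-overlapping size x size
    window and stack the two results along a new channel axis."""
    h, w = feature_map.shape
    h2, w2 = h // size, w // size
    # Reshape so each pooling window occupies its own pair of axes.
    blocks = feature_map[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    max_pooled = blocks.max(axis=(1, 3))
    avg_pooled = blocks.mean(axis=(1, 3))
    # Concatenate channel-wise: downstream layers see both statistics.
    return np.stack([max_pooled, avg_pooled], axis=0)

fm = np.array([[ 1.,  2.,  3.,  4.],
               [ 5.,  6.,  7.,  8.],
               [ 9., 10., 11., 12.],
               [13., 14., 15., 16.]])
out = combined_pool(fm)
# out[0] is the max-pooled map, out[1] the average-pooled map;
# out has shape (2, 2, 2) for this 4x4 input.
```

The second idea, shortcut connections, corresponds to the identity-skip structure of residual learning, where a layer's input is added to its output so the layer only has to model a residual.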


DOI: 10.21437/Interspeech.2016-1236

Cite as

Zhang, W., Lei, W., Xu, X., Xing, X. (2016) Improved Music Genre Classification with Convolutional Neural Networks. Proc. Interspeech 2016, 3304-3308.

Bibtex
@inproceedings{Zhang+2016,
  author={Weibin Zhang and Wenkang Lei and Xiangmin Xu and Xiaofeng Xing},
  title={Improved Music Genre Classification with Convolutional Neural Networks},
  year=2016,
  booktitle={Interspeech 2016},
  doi={10.21437/Interspeech.2016-1236},
  url={http://dx.doi.org/10.21437/Interspeech.2016-1236},
  pages={3304--3308}
}