12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Deep Belief Networks for Automatic Music Genre Classification

Xiaohong Yang, Qingcai Chen, Shusen Zhou, Xiaolong Wang

Harbin Institute of Technology, China

This paper proposes an approach to automatic music genre classification using deep belief networks. The deep belief network is built from restricted Boltzmann machines and takes as input acoustic features extracted through content-based analysis of music signals. The model parameters are first initialized by training the network with a greedy layer-wise learning algorithm on feature vectors composed of short-term and long-term features. The parameters are then fine-tuned to a local optimum using the back-propagation algorithm. Experiments on the GTZAN dataset show that deep belief networks outperform widely used classification methods such as support vector machines, K-nearest neighbors, linear discriminant analysis, and neural networks on music genre classification.
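
The greedy layer-wise stage described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class `RBM`, the function `pretrain_dbn`, the use of one-step contrastive divergence (CD-1), the layer sizes, the learning rate, and the random placeholder features are all assumptions made for illustration. The supervised back-propagation fine-tuning step is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Hypothetical binary restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def pretrain_dbn(X, layer_sizes, epochs=5, seed=0):
    """Train one RBM per layer; each layer's hidden output feeds the next."""
    rng = np.random.default_rng(seed)
    rbms, data = [], X
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(data)
        rbms.append(rbm)
        data = rbm.hidden_probs(data)
    return rbms, data

# Placeholder input: 20 random feature vectors scaled to [0, 1).
# Real input would be the short-term and long-term acoustic features.
X = np.random.default_rng(1).random((20, 8))
rbms, top = pretrain_dbn(X, layer_sizes=[6, 4])
```

In a full system, the stacked weights would initialize a feed-forward network with an added output layer over the genre labels, and that network would then be trained with back-propagation.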

Bibliographic reference. Yang, Xiaohong / Chen, Qingcai / Zhou, Shusen / Wang, Xiaolong (2011): "Deep belief networks for automatic music genre classification", in INTERSPEECH-2011, 2433-2436.