Multilingual i-Vector Based Statistical Modeling for Music Genre Classification

Jia Dai, Wei Xue, Wenju Liu


In music signal processing, modeling long-time features captures the time-series characteristics of the music signal better than modeling each short-time frame independently. As a typical long-time modeling strategy, the identification vector (i-vector) uses statistical modeling to represent the audio signal at the segment level. It can better capture the important elements of the music signal, and these elements may benefit music signal classification. In this paper, the i-vector based statistical feature is explored for music genre classification. In addition, to learn enough of the important elements of the music signal, a new multilingual i-vector feature is proposed based on a multilingual model. The experimental results show that the multilingual i-vector based models achieve better classification performance than conventional methods based on short-time modeling.


DOI: 10.21437/Interspeech.2017-74

Cite as: Dai, J., Xue, W., Liu, W. (2017) Multilingual i-Vector Based Statistical Modeling for Music Genre Classification. Proc. Interspeech 2017, 459-463, DOI: 10.21437/Interspeech.2017-74.


@inproceedings{Dai2017,
  author={Jia Dai and Wei Xue and Wenju Liu},
  title={Multilingual i-Vector Based Statistical Modeling for Music Genre Classification},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={459--463},
  doi={10.21437/Interspeech.2017-74},
  url={http://dx.doi.org/10.21437/Interspeech.2017-74}
}