ISCA Archive Interspeech 2013

Audio classification using dominant spatial patterns in time-frequency space

Md. Khademul Islam Molla, Keikichi Hirose

This paper presents a novel audio discrimination algorithm using spatial features in time-frequency (TF) space. Three types of audio signals (speech, music without vocals, and music with background vocals) are considered for classification. The audio segment is transformed into the TF domain, yielding a spatial representation of its energy. Non-negative matrix factorization (NMF) is applied to the TF space to extract a set of vectors that represents the dominant subspace of the spatial energy distribution. The inverse Fourier transform is applied to the individual dominant vectors to derive the features for audio discrimination. Classification is performed using multiclass linear discriminant analysis (mcLDA). The experimental results show that the proposed algorithm is more noise-robust and performs better than recently reported methods.
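The feature-extraction steps named in the abstract (TF transform, NMF, inverse Fourier transform of the dominant vectors) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the frame length, hop size, NMF rank, NMF update rule, ordering of components, or number of retained coefficients, so every such choice below is an assumption, and the function names `spectrogram`, `nmf`, and `segment_features` are hypothetical.

```python
import numpy as np


def spectrogram(x, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) from a Hann-windowed STFT.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def nmf(V, rank=4, n_iter=200, seed=0):
    """Approximate V by W @ H with non-negative factors.

    Uses Lee-Seung multiplicative updates for the Frobenius cost; the paper's
    NMF variant is not stated in the abstract, so this is an assumption.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9  # guards against division by zero
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def segment_features(x, rank=4, n_coef=16, frame_len=256, hop=128):
    """Feature vector for one audio segment.

    Assumed recipe: rank the NMF spectral basis vectors by the energy they
    explain, normalise each to unit norm, apply the inverse real FFT, and
    keep the first n_coef coefficients of each.
    """
    V = spectrogram(x, frame_len, hop)
    W, H = nmf(V, rank=rank)
    energy = np.linalg.norm(W, axis=0) * np.linalg.norm(H, axis=1)
    order = np.argsort(energy)[::-1]  # most dominant component first
    feats = []
    for k in order:
        basis = W[:, k] / (np.linalg.norm(W[:, k]) + 1e-12)
        feats.append(np.fft.irfft(basis, n=frame_len)[:n_coef])
    return np.concatenate(feats)


# Usage on a synthetic two-tone segment (a stand-in for real audio).
sr = 8000
t = np.arange(4096) / sr
segment = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
features = segment_features(segment)
print(features.shape)  # (64,) = rank * n_coef
```

Feature vectors computed this way for labelled segments would then be passed to a multiclass LDA classifier; scikit-learn's `LinearDiscriminantAnalysis` is one readily available option, though the abstract does not say which implementation was used.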


doi: 10.21437/Interspeech.2013-651

Cite as: Molla, M.K.I., Hirose, K. (2013) Audio classification using dominant spatial patterns in time-frequency space. Proc. Interspeech 2013, 2915-2919, doi: 10.21437/Interspeech.2013-651

@inproceedings{molla13_interspeech,
  author={Md. Khademul Islam Molla and Keikichi Hirose},
  title={{Audio classification using dominant spatial patterns in time-frequency space}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2915--2919},
  doi={10.21437/Interspeech.2013-651}
}