INTERSPEECH 2013
14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Audio Classification Using Dominant Spatial Patterns in Time-Frequency Space

Md. Khademul Islam Molla, Keikichi Hirose

University of Tokyo, Japan

This paper presents a novel audio discrimination algorithm using spatial features in time-frequency (TF) space. Three types of audio signals are considered for classification: speech, music without vocals, and music with background vocals. The audio segment is transformed into the TF domain, yielding a spatial representation of its energy. Non-negative matrix factorization (NMF) is applied to the TF representation to extract a set of vectors that represent the dominant subspace of the spatial energy distribution. The inverse Fourier transform is applied to each dominant vector to derive the features for audio discrimination. Classification is performed using multiclass linear discriminant analysis (mcLDA). Experimental results show that the proposed algorithm is more robust to noise and performs better than recently reported methods.
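The feature-extraction steps named in the abstract (TF transform, NMF, inverse Fourier transform of the dominant vectors) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the TF transform, the NMF algorithm, the rank, or the feature length, so the Hann-windowed STFT, the Lee-Seung multiplicative updates, and the parameters `frame_len`, `hop`, `rank`, and `n_coeffs` are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency x time) from a Hann-windowed STFT.
    Assumption: the paper's TF transform may differ."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~= W @ H with non-negative W, H using multiplicative
    updates for the Euclidean cost (an assumed choice of NMF algorithm)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def dominant_pattern_features(x, rank=4, n_coeffs=12):
    """Hypothetical feature vector: the first n_coeffs samples of the
    inverse real FFT of each NMF spectral basis vector."""
    V = spectrogram(x)
    W, H = nmf(V, rank)
    # Order components by their energy contribution to the reconstruction.
    energy = np.linalg.norm(W, axis=0) * np.linalg.norm(H, axis=1)
    W = W[:, np.argsort(-energy)]
    # Inverse Fourier transform applied to each dominant vector (column of W).
    coeffs = np.fft.irfft(W, axis=0)[:n_coeffs]
    return coeffs.T.ravel()

# Usage on a synthetic one-second signal sampled at 8 kHz.
rng = np.random.default_rng(1)
t = np.arange(8000) / 8000.0
x = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(8000)
features = dominant_pattern_features(x)
```

Feature vectors computed this way for labelled segments could then be passed to any multiclass LDA implementation, for example scikit-learn's `LinearDiscriminantAnalysis`, to stand in for the mcLDA classification stage.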


Bibliographic reference. Molla, Md. Khademul Islam / Hirose, Keikichi (2013): "Audio classification using dominant spatial patterns in time-frequency space", in INTERSPEECH-2013, 2915-2919.