13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Acoustic Features for Classification Based Speech Separation

Yuxuan Wang (1), Kun Han (1), DeLiang Wang (1,2)

(1) Department of Computer Science and Engineering; (2) Center for Cognitive Science;
The Ohio State University, USA

Speech separation can be effectively formulated as a binary classification problem. A classification-based system produces a binary mask using acoustic features in each time-frequency unit. So far, only pitch and the amplitude modulation spectrogram have been used as unit-level features. In this paper, we study other acoustic features and show that they can significantly improve both voiced and unvoiced speech separation performance. To further explore the complementarity of these features in terms of discriminative power, we propose a group Lasso approach for feature combination. The final combined feature set yields promising results in both matched and unmatched test conditions.

Index Terms: Speech separation, binary classification, feature combination, group Lasso
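
As an illustration of the group Lasso idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It shows only the standard block soft-thresholding step (the proximal operator of the group Lasso penalty, lam times the sum of the per-group L2 norms), which shrinks whole groups of weights to zero at once. Under the assumption that each group holds the weights of one acoustic feature type, the surviving groups indicate which features would be kept. The function names, the group layout, and the value of lam are hypothetical.

```python
import numpy as np


def group_soft_threshold(w, groups, lam):
    """Block soft-thresholding: proximal operator of lam * sum_g ||w_g||_2.

    w      : 1-D array of weights (hypothetical classifier weights)
    groups : list of index lists, one per feature group (assumed layout)
    lam    : non-negative regularization strength (assumed value)
    """
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    for idx in groups:
        idx = np.asarray(idx)
        norm = np.linalg.norm(w[idx])
        # A group survives only if its L2 norm exceeds lam;
        # otherwise every weight in the group is set to zero.
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * w[idx]
    return out


def selected_groups(w, groups):
    """Return the positions of groups that still have a non-zero weight."""
    w = np.asarray(w, dtype=float)
    return [g for g, idx in enumerate(groups) if np.any(w[np.asarray(idx)] != 0.0)]


# Toy example: three feature groups with two weights each.
weights = np.array([3.0, 4.0, 0.1, 0.1, 0.0, 2.0])
groups = [[0, 1], [2, 3], [4, 5]]
shrunk = group_soft_threshold(weights, groups, lam=1.0)
kept = selected_groups(shrunk, groups)
print(shrunk)
print(kept)
```

In this toy example the second group has a norm below lam and is removed as a whole, which is the group-level sparsity that makes the penalty suitable for choosing among feature types rather than among individual dimensions.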

Bibliographic reference. Wang, Yuxuan / Han, Kun / Wang, DeLiang (2012): "Acoustic features for classification based speech separation", In INTERSPEECH-2012, 1532-1535.