Speech separation can be effectively formulated as a binary classification problem. A classification-based system produces a binary mask using acoustic features extracted in each time-frequency unit. So far, only pitch and the amplitude modulation spectrogram have been used as unit-level features. In this paper, we study other acoustic features and show that they can significantly improve both voiced and unvoiced speech separation performance. To further explore the complementarity of these features in terms of discriminative power, we propose a group Lasso approach for feature combination. The final combined feature set yields promising results in both matched and unmatched test conditions.
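As a rough illustration of why group Lasso suits feature combination, the sketch below applies group soft-thresholding, the standard proximal step for the penalty lam * sum_g ||w_g||_2, to a weight vector split into feature groups. Groups whose weights have a small joint norm are zeroed as a whole, so entire feature types are kept or dropped together. The function name, the example weights, the three groups, and the value of lam are all hypothetical; this is not the paper's exact formulation or its feature set.

```python
import math


def group_soft_threshold(weights, groups, lam):
    """Apply the group Lasso proximal step to a weight vector.

    weights: list of floats (one weight per feature dimension).
    groups:  list of index lists, one per feature group (hypothetical grouping).
    lam:     non-negative threshold; larger values remove more groups.
    """
    out = [0.0] * len(weights)
    for idx in groups:
        # Joint L2 norm of all weights belonging to this feature group.
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        if norm > lam:
            # Shrink the whole group toward zero by a common factor.
            scale = 1.0 - lam / norm
            for i in idx:
                out[i] = scale * weights[i]
        # Otherwise the whole group stays at zero (the group is dropped).
    return out


# Illustrative values only: three made-up feature groups of two dimensions each.
weights = [3.0, 4.0, 0.1, -0.2, 0.0, 1.0]
groups = [[0, 1], [2, 3], [4, 5]]
shrunk = group_soft_threshold(weights, groups, lam=0.5)

# Indices of the groups that survive the thresholding step.
kept = [g for g, idx in enumerate(groups) if any(shrunk[i] != 0.0 for i in idx)]
```

Here the second group has a joint norm below lam and is removed entirely, while the other two are shrunk but retained; this all-or-nothing behaviour per group is what makes the penalty act as a feature-type selector.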
Index Terms: Speech separation, binary classification, feature combination, group Lasso
Bibliographic reference. Wang, Yuxuan / Han, Kun / Wang, DeLiang (2012): "Acoustic features for classification based speech separation", In INTERSPEECH-2012, 1532-1535.