ISCA Archive Interspeech 2013

Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation

Tse-En Lin, Chung-Chien Hsu, Yi-Cheng Chen, Jian-Hueng Chen, Tai-Shih Chi

This paper proposes a system for singing-voice separation from monaural recordings that cascades spectro-temporal modulation based singing voice detection with a Viterbi-based pitch tracking algorithm. To detect the singing voice, the spectro-temporal modulation energy related to voice harmonics is extracted using a spectro-temporal modulation analysis framework developed for the Fourier spectrogram. The singing voice is then separated from the background music using a binary mask that groups the estimated harmonics of the singing voice. The proposed system is evaluated on the MIR-1K dataset and is shown to outperform three other binary-mask based systems in the vocal/music separation task.
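
To make the harmonic grouping step concrete, the following is a minimal sketch of a pitch-based binary mask. It is not the authors' implementation. It assumes a per-frame pitch estimate is already available (in the paper this comes from the singing detection and Viterbi pitch tracking stages), with 0 marking frames judged non-vocal. The function name `harmonic_binary_mask`, the linear bin spacing, and the tolerance `tol_hz` are hypothetical choices made for illustration only.

```python
def harmonic_binary_mask(f0_hz, n_bins, sample_rate, tol_hz=20.0):
    """Build a binary mask, indexed as mask[frame][bin], that is True for
    spectrogram bins lying within tol_hz of a harmonic of the frame's pitch.

    f0_hz       -- per-frame pitch estimates in Hz; 0 means "no singing voice"
    n_bins      -- number of frequency bins from 0 Hz to Nyquist (assumed linear)
    sample_rate -- sampling rate in Hz
    tol_hz      -- hypothetical half-width of each harmonic band in Hz
    """
    # Centre frequency of every bin, evenly spaced from 0 Hz to Nyquist.
    nyquist = sample_rate / 2.0
    bin_freqs = [i * nyquist / (n_bins - 1) for i in range(n_bins)]

    mask = []
    for f0 in f0_hz:
        if f0 <= 0:
            # Non-vocal frame: assign every bin to the background music.
            mask.append([False] * n_bins)
            continue
        frame = []
        for f in bin_freqs:
            # Index of the harmonic closest to this bin (at least the fundamental).
            k = max(round(f / f0), 1)
            frame.append(abs(f - k * f0) <= tol_hz)
        mask.append(frame)
    return mask


def apply_mask(magnitudes, mask):
    """Keep the masked bins (voice estimate) and zero out the rest."""
    return [[m if keep else 0.0 for m, keep in zip(frame, frame_mask)]
            for frame, frame_mask in zip(magnitudes, mask)]
```

Under these assumptions, the voice estimate is obtained by applying the mask to the mixture magnitude spectrogram, and the accompaniment estimate by applying its complement. A fixed tolerance in Hz is only one possible choice; a tolerance that scales with the harmonic number would be an equally plausible variant.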


doi: 10.21437/Interspeech.2013-652

Cite as: Lin, T.-E., Hsu, C.-C., Chen, Y.-C., Chen, J.-H., Chi, T.-S. (2013) Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation. Proc. Interspeech 2013, 2920-2923, doi: 10.21437/Interspeech.2013-652

@inproceedings{lin13b_interspeech,
  author={Tse-En Lin and Chung-Chien Hsu and Yi-Cheng Chen and Jian-Hueng Chen and Tai-Shih Chi},
  title={{Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2920--2923},
  doi={10.21437/Interspeech.2013-652}
}