On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music

Bidisha Sharma, Rohan Kumar Das, Haizhou Li


Singer identification is the task of automatically identifying the singer in a music recording, such as a polyphonic song. A song has two major acoustic components: singing vocals and background accompaniment. Although singer identification is similar to speaker identification, it is challenging because the background accompaniment interferes with the singer-specific information in the singing vocals. We believe that separating the background accompaniment from the singing vocals will help to overcome this interference. In this work, we extract the singing vocals from polyphonic songs using a Wave-U-Net based audio-source separation approach. The extracted singing vocals are then used in an i-vector based singer identification system. Further, we explore different state-of-the-art audio-source separation methods to establish the role of the chosen separation method when applied to singer identification. The proposed singer identification framework achieves an absolute accuracy improvement of 5.66% over the baseline without audio-source separation.


DOI: 10.21437/Interspeech.2019-1925

Cite as: Sharma, B., Das, R.K., Li, H. (2019) On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music. Proc. Interspeech 2019, 2020-2024, DOI: 10.21437/Interspeech.2019-1925.


@inproceedings{Sharma2019,
  author={Bidisha Sharma and Rohan Kumar Das and Haizhou Li},
  title={{On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2020--2024},
  doi={10.21437/Interspeech.2019-1925},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1925}
}