Unmixing Convolutive Mixtures by Exploiting Amplitude Co-Modulation: Methods and Evaluation on Mandarin Speech Recordings

Bo-Rui Chen, Huang-Yi Lee, Yi-Wen Liu


This paper presents and evaluates two frequency-domain methods for multi-channel sound source separation. The sources are assumed to couple to the microphones through unknown room responses. Independent component analysis (ICA) is applied in the frequency domain to obtain maximally independent amplitude envelopes (AEs) at every frequency. Because ICA recovers the components at each frequency in an arbitrary order, the AEs across frequencies need to be de-permuted. To this end, we seek to assign AEs to the same source based solely on the correlation of their magnitude variation over time. The resulting time-varying spectra are inverse Fourier transformed to synthesize the separated signals. Objective evaluation showed that both methods achieve a signal-to-interference ratio (SIR) comparable to that of Mazur et al. (2013). In addition, we created spoken Mandarin materials and recruited age-matched subjects to perform word-by-word transcription. Results showed that, first, speech intelligibility improved significantly after unmixing. Second, while both methods achieved similar SIR, the subjects preferred to listen to the results that were post-processed to ensure a speech-like spectral shape; the mean opinion scores of the two methods were 2.9 and 4.3 (out of 5). The present results may provide suggestions regarding the deployment of correlation-based source separation algorithms in devices with limited computational resources.
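The de-permutation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `envelope_corr` and `align_permutations`, the array layout, and the greedy strategy of matching each frequency bin against a running mean of the already aligned envelopes are all assumptions made for illustration. It only shows the general idea of using envelope correlation over time to decide which ICA output belongs to which source.

```python
import itertools
import numpy as np


def envelope_corr(a, b):
    """Pearson correlation between two 1-D amplitude envelopes."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against constant envelopes, whose correlation is undefined.
    return float(a @ b / denom) if denom > 0 else 0.0


def align_permutations(envelopes):
    """Greedy correlation-based de-permutation (illustrative assumption).

    envelopes: array of shape (n_freq, n_src, n_frames) holding the
        amplitude envelope of every ICA output at every frequency bin.
    Returns (aligned, perms): the reordered envelopes and, per bin,
        the source order that was chosen.
    """
    n_freq, n_src, _ = envelopes.shape
    aligned = envelopes.astype(float).copy()
    perms = np.zeros((n_freq, n_src), dtype=int)
    perms[0] = np.arange(n_src)

    # Reference envelopes: running mean over the bins aligned so far.
    ref = aligned[0].copy()
    for f in range(1, n_freq):
        best_perm, best_score = None, -np.inf
        # Exhaustive search over orderings; fine for a handful of sources.
        for perm in itertools.permutations(range(n_src)):
            score = sum(
                envelope_corr(ref[s], envelopes[f, p])
                for s, p in enumerate(perm)
            )
            if score > best_score:
                best_perm, best_score = perm, score
        perms[f] = best_perm
        aligned[f] = envelopes[f, list(best_perm)]
        ref = (ref * f + aligned[f]) / (f + 1)
    return aligned, perms
```

The exhaustive search over orderings grows factorially with the number of sources, so this sketch is only practical for small source counts; the order of sources in the first bin is taken as the reference, which leaves the usual global permutation ambiguity.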


DOI: 10.21437/Interspeech.2017-1227

Cite as: Chen, B., Lee, H., Liu, Y. (2017) Unmixing Convolutive Mixtures by Exploiting Amplitude Co-Modulation: Methods and Evaluation on Mandarin Speech Recordings. Proc. Interspeech 2017, 1934-1937, DOI: 10.21437/Interspeech.2017-1227.


@inproceedings{Chen2017,
  author={Bo-Rui Chen and Huang-Yi Lee and Yi-Wen Liu},
  title={Unmixing Convolutive Mixtures by Exploiting Amplitude Co-Modulation: Methods and Evaluation on Mandarin Speech Recordings},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1934--1937},
  doi={10.21437/Interspeech.2017-1227},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1227}
}