ISCA Archive SAPA 2004

Soft mask estimation for single channel speaker separation

Aarthi M. Reddy, Bhiksha Raj

Single channel speaker separation attempts to extract the speech signal uttered by a speaker of interest from a signal containing a mixture of auditory signals. Most algorithms that address this problem are based on masking, where reliable components of the mixed signal spectrogram are inverted to obtain the speech signal of the speaker of interest. Most current techniques estimate this mask in a binary fashion, resulting in a hard mask. We present a technique to estimate a soft mask that weights the frequency sub-bands of the mixed signal. The speech signal can then be reconstructed from the estimated power spectrum of the speaker of interest. Experimental results presented in this paper show that the soft mask gives better results than those obtained by estimating a hard mask.
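
To illustrate the difference between a hard and a soft mask, here is a minimal Python sketch. It is not the estimation method of the paper, which the abstract does not describe. It assumes, purely for illustration, that the power spectra of both the target and the interfering speaker are known (an "oracle" setting) and that powers add in each time-frequency bin. The function names `hard_mask`, `soft_mask`, and `apply_mask`, and all numeric values, are hypothetical.

```python
def hard_mask(target_power, interferer_power):
    """Binary mask: 1.0 where the target dominates a bin, otherwise 0.0."""
    return [[1.0 if t > i else 0.0 for t, i in zip(t_row, i_row)]
            for t_row, i_row in zip(target_power, interferer_power)]


def soft_mask(target_power, interferer_power, eps=1e-12):
    """Ratio mask in [0, 1]: share of each bin's power assigned to the target.

    Assumption: an oracle ratio mask stands in for an estimated soft mask.
    """
    return [[t / (t + i + eps) for t, i in zip(t_row, i_row)]
            for t_row, i_row in zip(target_power, interferer_power)]


def apply_mask(mask, mixed_power):
    """Weight each time-frequency bin of the mixed power spectrum by the mask."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixed_power)]


# Toy power spectra: 2 frames x 3 frequency sub-bands (made-up numbers).
target = [[4.0, 1.0, 0.0], [2.0, 2.0, 6.0]]
interferer = [[1.0, 3.0, 2.0], [2.0, 6.0, 2.0]]

# Assumption: the mixed power in each bin is the sum of the source powers.
mixed = [[t + i for t, i in zip(t_row, i_row)]
         for t_row, i_row in zip(target, interferer)]

hard_est = apply_mask(hard_mask(target, interferer), mixed)
soft_est = apply_mask(soft_mask(target, interferer), mixed)

print("hard:", hard_est)
print("soft:", soft_est)
```

In this toy setting the hard mask either keeps a bin's full mixed power or discards it entirely, while the soft mask scales each bin by a weight between 0 and 1. In practice the mask has to be estimated from the mixed signal alone, which is the problem the paper addresses.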


Cite as: Reddy, A.M., Raj, B. (2004) Soft mask estimation for single channel speaker separation. Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004), paper 158

@inproceedings{reddy04_sapa,
  author={Aarthi M. Reddy and Bhiksha Raj},
  title={{Soft mask estimation for single channel speaker separation}},
  year=2004,
  booktitle={Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004)},
  pages={paper 158}
}