We present a new approach for separating two speech signals when only a single recording of their additive mixture is available. The log spectra of the sources are estimated by maximum a posteriori (MAP) estimation, given the log spectrum of the mixture and the probability density functions of the sources. We show that the estimation leads to a two-state, non-linear filter whose states are controlled by the means of the sources. The first state of the filter is a combination of two Wiener filters whose parameters are controlled by the means and variances of the sources and by the noise variance; the second state is given by the means of the sources. Experiments on a wide variety of mixtures show that the MAP-based estimator outperforms methods that use binary mask filtering or Wiener filtering for the separation task.
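For orientation, the two baseline techniques named in the abstract, binary mask filtering and Wiener filtering, can be illustrated with a minimal sketch. This is not the MAP estimator of the paper, whose exact form is not given here. The function names are hypothetical, and the sketch assumes that per-frequency-bin power estimates `p1` and `p2` for the two sources are already available, for example from trained source models.

```python
def wiener_gains(p1, p2):
    """Per-bin Wiener gains for two sources, given their power estimates.

    Assumption: p1 and p2 are equal-length sequences of non-negative
    power values, one per frequency bin.
    """
    g1 = []
    for a, b in zip(p1, p2):
        total = a + b
        # If both sources have zero power in a bin, split the bin evenly.
        g1.append(a / total if total > 0 else 0.5)
    g2 = [1.0 - g for g in g1]
    return g1, g2


def binary_masks(p1, p2):
    """Per-bin binary masks: each bin is assigned to the stronger source."""
    m1 = [1.0 if a >= b else 0.0 for a, b in zip(p1, p2)]
    m2 = [1.0 - m for m in m1]
    return m1, m2


def apply_gain(mixture_mag, gain):
    """Scale the mixture magnitude spectrum bin by bin."""
    return [y * g for y, g in zip(mixture_mag, gain)]


if __name__ == "__main__":
    # Hypothetical power estimates for three frequency bins.
    p1 = [4.0, 1.0, 0.0]
    p2 = [1.0, 1.0, 2.0]
    mixture = [2.0, 1.5, 1.0]

    g1, _ = wiener_gains(p1, p2)
    m1, _ = binary_masks(p1, p2)
    print("Wiener estimate of source 1:", apply_gain(mixture, g1))
    print("Binary-mask estimate of source 1:", apply_gain(mixture, m1))
```

The Wiener gain is a soft weight between 0 and 1 in every bin, whereas the binary mask makes a hard decision per bin. The abstract describes the proposed filter as switching between a Wiener-like state and a state given by the source means.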
Bibliographic reference. Radfar, M. H. / Dansereau, R. M. (2007): "Single channel speech separation using maximum a posteriori estimation", In INTERSPEECH-2007, 958-961.