Speech Enhancement Using the Minimum-probability-of-error Criterion

Jishnu Sadasivan, Subhadip Mukherjee, Chandra Sekhar Seelamantula


We propose a novel speech denoising framework based on minimizing the probability of error (PE), which is the probability that the estimate deviates from its true value. Developing the minimum PE (MPE) criterion requires knowledge of the noise probability density function (p.d.f.), which may not be available in parametric form in speech denoising applications. We therefore adopt two approaches to modeling the noise p.d.f.: (i) Gaussian modeling based on adaptive variance estimation; and (ii) a Gaussian mixture model (GMM), chosen for its approximation capabilities. We consider discrete cosine transform (DCT) domain shrinkage, where the optimum shrinkage parameter is obtained by minimizing an estimate of the PE. A performance assessment on real-world noise types shows that, for input signal-to-noise ratios (SNRs) greater than 5 dB, the proposed MPE-based point-wise shrinkage estimators outperform three benchmark techniques in terms of segmental SNR and short-time objective intelligibility (STOI) scores.
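
As a rough illustration of the point-wise shrinkage idea under the Gaussian noise model, the sketch below picks a shrinkage factor for a single noisy coefficient by minimizing a plug-in PE. It is not the authors' estimator: the abstract does not specify how the PE is estimated, so the tolerance `eps`, the substitution of the noisy coefficient `x` for the unknown clean coefficient, the grid search, and all function names are assumptions made only for this example.

```python
import math


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_error(a, s, sigma, eps):
    """P(|a*(s + w) - s| > eps) for w ~ N(0, sigma^2), with 0 < a <= 1.

    The estimation error a*(s + w) - s is Gaussian with mean (a - 1)*s
    and standard deviation a*sigma.
    """
    bias = (a - 1.0) * s
    spread = a * sigma
    inside = gaussian_cdf((eps - bias) / spread) - gaussian_cdf((-eps - bias) / spread)
    return 1.0 - inside


def mpe_shrinkage(x, sigma, eps, grid_size=100):
    """Grid-search the shrinkage factor a in (0, 1] that minimizes a plug-in PE.

    Assumption: the unknown clean coefficient s is replaced by the noisy
    observation x when evaluating the PE.
    """
    best_a, best_pe = 1.0, float("inf")
    for k in range(1, grid_size + 1):
        a = k / grid_size
        pe = probability_of_error(a, x, sigma, eps)
        if pe < best_pe:
            best_a, best_pe = a, pe
    return best_a, best_pe


if __name__ == "__main__":
    # A large coefficient is kept almost unchanged; a zero coefficient is shrunk heavily.
    print(mpe_shrinkage(10.0, sigma=1.0, eps=1.0))
    print(mpe_shrinkage(0.0, sigma=1.0, eps=1.0))
```

In a full pipeline, such a factor would be computed for each DCT coefficient of a noisy frame, with `sigma` taken from a noise variance estimate; the GMM noise model mentioned above would replace the single Gaussian term with a weighted sum over mixture components.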


DOI: 10.21437/Interspeech.2018-1294

Cite as: Sadasivan, J., Mukherjee, S., Seelamantula, C.S. (2018) Speech Enhancement Using the Minimum-probability-of-error Criterion. Proc. Interspeech 2018, 1141-1145, DOI: 10.21437/Interspeech.2018-1294.


@inproceedings{Sadasivan2018,
  author={Jishnu Sadasivan and Subhadip Mukherjee and Chandra Sekhar Seelamantula},
  title={Speech Enhancement Using the Minimum-probability-of-error Criterion},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1141--1145},
  doi={10.21437/Interspeech.2018-1294},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1294}
}