15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

A Data-Driven Approach to Speech Enhancement Using Gaussian Process

Sukanya Sonowal (1), Kisoo Kwon (1), Nam Soo Kim (1), Jong Won Shin (2)

(1) Seoul National University, Korea
(2) GIST, Korea

This paper presents a novel data-driven approach to single-channel speech enhancement that employs a Gaussian process (GP). The approach applies GP regression to estimate the residual gain, using the a priori and a posteriori signal-to-noise ratios (SNRs) as input features. The residual gain is defined as the difference between the optimal gain and the gain obtained from the minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator. The proposed approach has a cascaded structure consisting of two stages. In the first stage, the gain of the MMSE-LSA estimator is calculated together with the SNR features. In the second stage, the residual gains are estimated through GP regression and used to further enhance the output of the MMSE-LSA module. Experimental results show that the proposed approach produces better speech quality than both the MMSE-LSA enhancement module alone and the other data-driven technique it was compared against.
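The two-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy single-bin data, the oracle a priori SNR, the choice of |S|/|Y| (clipped to 1) as the "optimal" gain, the RBF kernel, the hyperparameter values, and every function name are assumptions made for illustration. Only the MMSE-LSA gain formula is the standard Ephraim-Malah expression, G = ξ/(1+ξ) · exp(½·E1(v)) with v = ξγ/(1+ξ).

```python
import math
import numpy as np

def exp_integral_e1(x):
    """E1(x) for x > 0: power series for x <= 1, continued fraction otherwise."""
    if x <= 1.0:
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k          # term = (-x)^k / k!
            total += term / k
        return -np.euler_gamma - math.log(x) - total
    b, c = x + 1.0, 1e300           # modified Lentz evaluation
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)

def mmse_lsa_gain(xi, gamma):
    """Stage 1: MMSE-LSA gain from a priori SNR xi and a posteriori SNR gamma."""
    v = xi * gamma / (1.0 + xi)
    return xi / (1.0 + xi) * math.exp(0.5 * exp_integral_e1(v))

def rbf_kernel(a, b, length_scale, signal_var):
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict_mean(x_train, y_train, x_test,
                    length_scale=1.0, signal_var=1.0, noise_var=1e-2):
    """Stage 2: zero-mean GP regression, posterior mean only."""
    k = rbf_kernel(x_train, x_train, length_scale, signal_var)
    chol = np.linalg.cholesky(k + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train, length_scale, signal_var) @ alpha

def make_toy_data(n, rng):
    """Hypothetical single-bin data: complex Gaussian speech, unit-variance noise."""
    xi_db = rng.uniform(-10.0, 20.0, n)
    xi = 10.0 ** (xi_db / 10.0)                      # oracle a priori SNR (assumption)
    s = np.sqrt(xi / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    d = np.sqrt(0.5) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    y = s + d
    gamma = np.abs(y) ** 2                           # a posteriori SNR (noise var = 1)
    g_opt = np.minimum(np.abs(s) / np.abs(y), 1.0)   # assumed "optimal" gain
    g_lsa = np.minimum([mmse_lsa_gain(a, b) for a, b in zip(xi, gamma)], 1.0)
    feats = np.column_stack([xi_db, 10.0 * np.log10(gamma)])
    return feats, g_lsa, g_opt

rng = np.random.default_rng(0)
x_tr, lsa_tr, opt_tr = make_toy_data(300, rng)
x_te, lsa_te, opt_te = make_toy_data(100, rng)

# Train on residual gains (optimal minus MMSE-LSA), then correct the test gains.
res_hat = gp_predict_mean(x_tr, opt_tr - lsa_tr, x_te,
                          length_scale=5.0, signal_var=0.1, noise_var=0.05)
g_final = np.clip(lsa_te + res_hat, 0.0, 1.0)

print("gain MSE, MMSE-LSA only:", np.mean((lsa_te - opt_te) ** 2))
print("gain MSE, with GP residual:", np.mean((g_final - opt_te) ** 2))
```

The GP is trained on the residual rather than on the gain itself, so a zero-mean prior falls back to the plain MMSE-LSA gain wherever the training data carry little information. The exact GP solve scales cubically with the number of training points, so a realistic training set would likely need subsampling or a sparse approximation.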


Bibliographic reference. Sonowal, Sukanya / Kwon, Kisoo / Kim, Nam Soo / Shin, Jong Won (2014): "A data-driven approach to speech enhancement using Gaussian process", in INTERSPEECH-2014, 2847-2851.