This paper proposes and evaluates a novel technique, based on empirical mode decomposition (EMD), for improving the noise robustness of automatic speech recognition systems. EMD generalizes Fourier analysis to non-linear and non-stationary time functions, in our case speech feature sequences. We post-process the log-energy feature using the first and second intrinsic mode functions (IMFs) obtained from the EMD analysis; IMFs include sinusoidal functions as special cases. Experimental results on the Aurora 2.0 noisy-digit database show that the proposed method yields significant improvement on the mismatched (clean-training) tasks.
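To illustrate what extracting the first and second IMFs from a feature sequence involves, the following is a minimal sketch of the generic EMD sifting procedure, not the authors' implementation. It rests on several assumptions: envelopes are piecewise-linear rather than the cubic splines commonly used in EMD, the number of sifting passes is fixed, and the names `find_extrema`, `envelope`, `sift_imf`, and `emd` are hypothetical. The test sequence is synthetic, and how the IMFs are combined with the log-energy feature is not specified in this abstract, so that step is left out.

```python
import math


def find_extrema(x):
    """Return indices of interior local maxima and minima of a sequence."""
    inner = range(1, len(x) - 1)
    maxima = [i for i in inner if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in inner if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through (i, x[i]) for i in knots.

    Assumption: linear interpolation, held flat outside the first and
    last knot. Standard EMD typically uses cubic splines instead.
    """
    env = [x[knots[0]] if t <= knots[0] else x[knots[-1]] for t in range(len(x))]
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env


def sift_imf(x, n_sift=10):
    """Extract one IMF candidate by repeatedly removing the envelope mean."""
    h = list(x)
    for _ in range(n_sift):  # assumption: fixed pass count, no stopping criterion
        maxima, minima = find_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, n_imfs=2):
    """Return up to n_imfs IMFs and the remaining residual."""
    residual = list(x)
    imfs = []
    for _ in range(n_imfs):
        maxima, minima = find_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another IMF
        imf = sift_imf(residual)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Synthetic stand-in for a feature sequence: fast and slow oscillation plus a trend.
signal = [
    math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 50) + 0.01 * t
    for t in range(200)
]
imfs, residual = emd(signal, n_imfs=2)
```

By construction, the extracted IMFs and the residual sum back to the input sequence, so the decomposition loses no information; the first IMF carries the fastest oscillation present in the sequence.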
Bibliographic reference. Wu, Kuo-Hao / Chen, Chia-Ping (2010): "Empirical mode decomposition for noise-robust automatic speech recognition", In INTERSPEECH-2010, 2074-2077.