Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Enhancement of Mel Log-Power Spectrum of Speech Using Particle Filtering

Ilyas Potamitis (1), Nikolaos Fakotakis (2)

(1) Technological Educational Institute of Crete, Greece; (2) University of Patras, Greece

This work presents a statistical feature enhancement technique for robust speech recognition that operates in the log-power domain, after the Mel filterbank has been applied. The approach uses a state-space formulation with a random walk model for the evolution of the underlying clean features and a non-linear observation model that relates the noisy features to the noise and the clean speech. Its novelty is twofold: (a) both the observation noise and the state noise are shown to be heavy-tailed and are therefore modelled with mixtures of Gaussians; (b) a sequential Monte Carlo filter approximates the posterior probability of clean speech, which avoids the linearization of the non-linear observation model required by algorithms that rely on iterative approximations. The effectiveness of the approach is illustrated for additive white Gaussian noise (AWGN) and babble noise at low signal-to-noise ratios (SNR).
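As a rough illustration of the ingredients named in the abstract, the following sketch runs a bootstrap particle filter on a single Mel channel. It is not the authors' implementation. The abstract does not state the observation equation, so the common log-domain mixing form y = log(exp(x) + exp(n)) is assumed here. The noise log-power is treated as a known constant, and all mixture weights, standard deviations, function names, and the particle count are hypothetical values chosen for the example.

```python
import math
import random


def sample_gmm(rng, weights, sigmas):
    """Draw one sample from a zero-mean Gaussian mixture (heavy-tailed noise)."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(0.0, sigmas[k])


def gmm_pdf(e, weights, sigmas):
    """Density of a zero-mean Gaussian mixture evaluated at e."""
    return sum(
        w * math.exp(-0.5 * (e / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, s in zip(weights, sigmas)
    )


def observe(x, n):
    """ASSUMED observation model: y = log(exp(x) + exp(n)), computed stably."""
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def particle_filter(ys, noise_logpow, n_particles=500, seed=0,
                    state_w=(0.9, 0.1), state_s=(0.2, 1.5),   # hypothetical
                    obs_w=(0.9, 0.1), obs_s=(0.3, 1.5)):      # hypothetical
    """Estimate clean log-power x_t from noisy log-power y_t, frame by frame."""
    rng = random.Random(seed)
    # Initialise particles around the first noisy observation.
    particles = [ys[0] + rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in ys:
        # 1. Propagate: random walk driven by Gaussian-mixture state noise.
        particles = [p + sample_gmm(rng, state_w, state_s) for p in particles]
        # 2. Weight: mixture likelihood of the residual under the
        #    non-linear observation model (no linearization needed).
        weights = [
            gmm_pdf(y - observe(p, noise_logpow), obs_w, obs_s) + 1e-300
            for p in particles
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
        # 3. Estimate: posterior mean of the clean log-power.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # 4. Resample (multinomial) to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Calling `particle_filter(noisy_track, noise_logpow=1.0)` on a list of per-frame noisy log-power values returns one clean-feature estimate per frame. In a full front end, the same filter would be run independently (or jointly) over every Mel channel, with the noise level estimated rather than fixed.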

Bibliographic reference.  Potamitis, Ilyas / Fakotakis, Nikolaos (2005): "Enhancement of mel log-power spectrum of speech using particle filtering", In INTERSPEECH-2005, 917-920.