10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

A Noise-Type and Level-Dependent MPO-Based Speech Enhancement Architecture with Variable Frame Analysis for Noise-Robust Speech Recognition

Vikramjit Mitra (1), Bengt J. Borgstrom (2), Carol Y. Espy-Wilson (1), Abeer Alwan (2)

(1) University of Maryland at College Park, USA
(2) University of California at Los Angeles, USA

In previous work, a speech enhancement algorithm based on phase opponency and a periodicity measure (MPO-APP) was developed for speech recognition. Axiomatic thresholds were used in the MPO-APP regardless of the signal-to-noise ratio (SNR) of the corrupted speech or any characterization of the noise. The current work develops an algorithm that adjusts the MPO-APP threshold based on the SNR and on whether the speech signal is clean, corrupted by aperiodic noise, or corrupted by noise with periodic components. In addition, variable frame rate (VFR) analysis is incorporated so that dynamic regions of the speech signal are sampled more heavily than steady-state regions. The result is a two-stage algorithm that outperforms both the previous MPO-APP and several other state-of-the-art speech enhancement algorithms.
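
The idea behind VFR analysis, sampling dynamic regions more densely than steady-state ones, can be illustrated with a minimal sketch. This is not the method of the paper, whose details the abstract does not give. It assumes a generic rule in which a frame is kept whenever the accumulated Euclidean distance between consecutive feature vectors reaches a threshold. The function name `select_frames`, the `threshold` parameter, and the toy feature values are all hypothetical.

```python
import math


def select_frames(features, threshold):
    """Return indices of frames kept by a simple variable frame rate rule.

    features:  list of equal-length feature vectors, one per densely
               sampled analysis frame (illustrative input, an assumption).
    threshold: hypothetical tuning parameter; a frame is kept once the
               accumulated frame-to-frame distance reaches this value.
    """
    if not features:
        return []
    kept = [0]          # always keep the first frame
    accumulated = 0.0   # distance accumulated since the last kept frame
    for i in range(1, len(features)):
        accumulated += math.dist(features[i - 1], features[i])
        if accumulated >= threshold:
            kept.append(i)
            accumulated = 0.0
    return kept


# Toy one-dimensional features: a steady region, a rapid transition,
# then another steady region.
features = [[0.0]] * 5 + [[1.0], [2.0], [3.0]] + [[3.0]] * 4
kept = select_frames(features, threshold=1.0)
print(kept)  # [0, 5, 6, 7]
```

In this toy input only the first frame is kept from the leading steady region, every frame of the transition is kept, and none are kept from the trailing steady region. Raising `threshold` would thin out the transition frames as well.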


Bibliographic reference. Mitra, Vikramjit / Borgstrom, Bengt J. / Espy-Wilson, Carol Y. / Alwan, Abeer (2009): "A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition", in INTERSPEECH-2009, 2751-2754.