12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

GMM-Based Missing-Feature Reconstruction on Multi-Frame Windows

Ulpu Remes (1), Yoshihiko Nankaku (2), Keiichi Tokuda (2)

(1) Aalto University, Finland
(2) Nagoya Institute of Technology, Japan

Methods for missing-feature reconstruction substitute noise-corrupted features with clean-speech estimates calculated from the reliable information found in the noisy speech signal. Gaussian mixture model (GMM) based reconstruction has conventionally focussed on the reliable information present in a single frame. In this work, GMM-based reconstruction is applied to windows that span several time frames. Mixtures of factor analysers (MFA) are used to limit the number of model parameters needed to describe the feature distribution as the window width increases. Using the window-based MFA in a noisy speech recognition task resulted in relative error reductions of up to 52% compared to the frame-based GMM.
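
To make the idea concrete, the following is a minimal sketch of GMM-based reconstruction. It is not taken from the paper. It assumes plain conditional-mean imputation: each unreliable component is replaced by its expected value given the reliable components, averaged over the mixture components by their posterior probabilities. The function names (`mfa_covariance`, `reconstruct`) and the reliability mask are hypothetical. The feature vector may be a single frame or several frames stacked into one window vector, and the factor-analyser covariance (a low-rank loading matrix plus diagonal noise) illustrates how an MFA component can describe a wide window with fewer parameters than a full covariance matrix.

```python
import numpy as np


def mfa_covariance(loading, noise_var):
    """Covariance of one factor analyser: loading @ loading.T + diag(noise_var).

    loading   : (D, q) factor loading matrix, q << D (assumed shape)
    noise_var : (D,) diagonal noise variances
    This needs D*q + D parameters instead of D*(D+1)/2 for a full covariance.
    """
    return loading @ loading.T + np.diag(noise_var)


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.shape[0] * np.log(2.0 * np.pi) + logdet + maha)


def reconstruct(x, reliable, weights, means, covs):
    """Replace unreliable entries of x with their GMM conditional mean.

    x        : (D,) float feature vector (one frame or a stacked window)
    reliable : (D,) boolean mask, True where the component is trusted
    weights  : (K,) mixture weights
    means    : (K, D) component means
    covs     : (K, D, D) component covariances (e.g. from mfa_covariance)
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(reliable, dtype=bool)
    m = ~r
    if not m.any():
        return x.copy()  # nothing to reconstruct

    K = len(weights)
    log_post = np.empty(K)
    cond_means = np.empty((K, int(m.sum())))
    for k in range(K):
        mu_r, mu_m = means[k][r], means[k][m]
        log_post[k] = np.log(weights[k])
        cond_means[k] = mu_m
        if r.any():
            S_rr = covs[k][np.ix_(r, r)]
            S_mr = covs[k][np.ix_(m, r)]
            # Component likelihood of the reliable part.
            log_post[k] += gaussian_logpdf(x[r], mu_r, S_rr)
            # Gaussian conditional mean of the unreliable part.
            cond_means[k] += S_mr @ np.linalg.solve(S_rr, x[r] - mu_r)

    # Normalise posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    out = x.copy()
    out[m] = post @ cond_means
    return out
```

In a window-based setting under these assumptions, `x` would be the concatenation of several consecutive frames, so reliable components in neighbouring frames can inform the estimate for an unreliable component in the current frame.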

Bibliographic reference. Remes, Ulpu / Nankaku, Yoshihiko / Tokuda, Keiichi (2011): "GMM-based missing-feature reconstruction on multi-frame windows", In INTERSPEECH-2011, 1665-1668.