This paper demonstrates the importance of accurately characterizing instantaneous acoustic noise for mask estimation in data imputation approaches to missing-feature-based ASR, especially in the presence of non-stationary background noise. Mask estimation relies on a hypothesis test designed to detect the presence of speech in time-frequency spectral bins under rapidly varying noise conditions. Masked mel-frequency filter bank energies are reconstructed using an MMSE-based data imputation procedure. The impact of this mask estimation approach is evaluated in the context of MMSE-based data imputation under multiple background conditions over a range of SNRs using the Aurora 2 speech corpus.
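To make the idea of a per-bin speech-presence decision concrete, here is a minimal sketch. It is not the hypothesis test used in the paper, which the abstract does not specify. It assumes a much simpler stand-in: a local SNR threshold applied to each time-frequency bin, with a separate noise estimate for every frame so that non-stationary noise can be followed. The function name `estimate_mask`, the parameter `snr_threshold_db`, and the input layout are all hypothetical.

```python
import math


def estimate_mask(noisy_energy, noise_energy, snr_threshold_db=0.0):
    """Return a binary mask: 1 where a bin is judged speech-dominated, else 0.

    ASSUMPTION: a plain local-SNR threshold stands in for the paper's
    hypothesis test, whose exact form is not given in the abstract.

    noisy_energy, noise_energy: lists of frames, each a list of per-channel
    filter bank energies on a linear scale. The noise estimate is supplied
    per frame, so it may change from one frame to the next.
    """
    mask = []
    for y_frame, n_frame in zip(noisy_energy, noise_energy):
        row = []
        for y, n in zip(y_frame, n_frame):
            # Estimate speech energy by power subtraction, floored at zero.
            speech = max(y - n, 0.0)
            # Local SNR in dB; small floors avoid log(0) and division by zero.
            snr_db = 10.0 * math.log10(max(speech, 1e-12) / max(n, 1e-12))
            row.append(1 if snr_db > snr_threshold_db else 0)
        mask.append(row)
    return mask
```

With two frames of two channels each, `estimate_mask([[10.0, 1.0], [2.0, 8.0]], [[1.0, 1.0], [4.0, 1.0]])` returns `[[1, 0], [0, 1]]`. Bins marked 0 are the ones a data imputation stage would treat as unreliable and reconstruct.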
Bibliographic reference. Badiezadegan, Shirin / Rose, Richard C. (2010): "Mask estimation in non-stationary noise environments for missing feature based robust speech recognition", In INTERSPEECH-2010, 2062-2065.