INTERSPEECH 2004 - ICSLP
In this paper we present two techniques to bridge the gap between the true and the estimated clean speech features in the context of Model-Based Feature Enhancement (MBFE) for noise-robust speech recognition. Some residual uncertainty remains in the output of every feature enhancement algorithm, but this information is currently mostly discarded. First, we explain how not only a global MMSE estimate of clean speech, but also several alternative (state-conditional) estimates, are generated and supplied to the back-end for recognition. Second, we explore the benefits of calculating the variance of the front-end estimate and incorporating it in the acoustic models of the recogniser. Experiments on the Aurora2 task confirmed the superior performance of the resulting system: for the clean training condition, average recognition accuracy increased from 85.65% to 88.50%.
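The two ideas above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians and an estimation error that is Gaussian and independent of the clean speech, and all function and parameter names are hypothetical. Under those assumptions, the global MMSE estimate is the posterior-weighted average of the state-conditional estimates (law of total expectation), and the front-end variance can be accounted for by adding it to the variance of each back-end Gaussian before scoring.

```python
import numpy as np


def global_mmse_estimate(state_posteriors, state_estimates):
    """Combine state-conditional estimates into one global MMSE estimate.

    state_posteriors: shape (S,), posterior probability of each front-end state.
    state_estimates:  shape (S, D), clean speech estimate given each state.
    """
    w = np.asarray(state_posteriors, dtype=float)
    est = np.asarray(state_estimates, dtype=float)
    return w @ est  # posterior-weighted average over states


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def log_gauss_with_uncertainty(x_hat, est_var, mean, var):
    """Score an enhanced feature while accounting for its uncertainty.

    Assumption: the estimation error is zero-mean Gaussian with variance
    est_var, so the back-end variance is simply broadened by est_var.
    """
    total_var = np.asarray(var, dtype=float) + np.asarray(est_var, dtype=float)
    return log_gauss_diag(x_hat, mean, total_var)
```

With zero estimate variance the second function reduces to the ordinary Gaussian score. With a large estimate variance, an unreliable feature far from the model mean is penalised less than it would be otherwise.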
Bibliographic reference. Van hamme, Hugo / Wambacq, Patrick / Stouten, Veronique (2004): "Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement", In INTERSPEECH-2004, 105-108.