Interspeech'2005 - Eurospeech
We propose an effective mask-estimation method for missing-feature reconstruction, aimed at robust speech recognition in unknown noise environments. Previous work found that a mask-estimation model trained on speech corrupted by white noise did not yield environment-independent recognition accuracy. We describe a training method based on bands of colored noise that better reflects spectral variations across neighboring frames and subbands. We obtain a further improvement in recognition accuracy by reconsidering frames that appeared to be unvoiced in the initial pitch analysis. Performance is evaluated on the Aurora 2.0 database in the presence of various types of noise maskers. Experimental results indicate that the proposed methods estimate masks effectively for missing-feature reconstruction while remaining less dependent on the noise conditions.
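The abstract does not give the details of the colored-noise training procedure, so the following is only a minimal sketch of the general idea: corrupt clean speech with noise confined to a frequency band, then derive a binary reliability mask to serve as a training target. Every name here (band_limited_noise, scale_to_snr, power_spectrogram, oracle_mask) is hypothetical, as are the 8 kHz sampling rate, the 1000 to 2000 Hz band, the 256-sample frames, and the 0 dB local-SNR threshold. A sine tone stands in for speech.

```python
import numpy as np


def band_limited_noise(n_samples, fs, f_lo, f_hi, rng):
    """Gaussian noise restricted to [f_lo, f_hi) Hz by zeroing FFT bins."""
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs >= f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=n_samples)


def scale_to_snr(clean, noise, snr_db):
    """Scale noise so the global clean-to-noise power ratio is snr_db."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return noise * gain


def power_spectrogram(x, frame_len=256, hop=128):
    """Hann-windowed short-time power spectrum, shape (frames, bins)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def oracle_mask(clean, noise, threshold_db=0.0):
    """True where the local SNR of a time-frequency cell exceeds the threshold.

    This oracle labeling is an assumption of the sketch, not a procedure
    taken from the paper.
    """
    s = power_spectrogram(clean)
    n = power_spectrogram(noise)
    local_snr_db = 10.0 * np.log10((s + 1e-12) / (n + 1e-12))
    return local_snr_db > threshold_db


# Illustrative use: one second of a 440 Hz tone at an assumed 8 kHz rate.
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
clean = np.sin(2.0 * np.pi * 440.0 * t)
noise = scale_to_snr(clean, band_limited_noise(fs, fs, 1000.0, 2000.0, rng), 10.0)
noisy = clean + noise          # input for a mask estimator
mask = oracle_mask(clean, noise)  # training target, one flag per cell
```

Repeating this over several bands and SNRs would give a training set whose corruption varies across subbands, which is the property the abstract attributes to colored-noise training. How the bands were actually chosen and combined is not stated in the abstract.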
Bibliographic reference. Kim, Wooil / Stern, Richard M. / Ko, Hanseok (2005): "Environment-independent mask estimation for missing-feature reconstruction", In INTERSPEECH-2005, 2637-2640.