ISCA Archive ICSLP 2000

Classifier-based mask estimation for missing feature methods of robust speech recognition

Michael L. Seltzer, Bhiksha Raj, Richard M. Stern

Missing feature methods of noise compensation for speech recognition operate by removing components of a spectrographic representation of speech that are considered to be corrupt, as indicated by a low signal-to-noise ratio. Recognition is either performed directly on the incomplete spectrograms or the missing components are reconstructed prior to recognition. These methods require a spectrographic mask which accurately labels the reliable and corrupt regions of the spectrogram. Current methods of mask estimation rely on assumptions about the corrupting noise such as stationarity. This is a significant drawback since the missing feature methods themselves have no such restrictions. We present a new mask estimation technique that uses a Bayesian classifier to determine the reliability of spectrographic elements. Features were designed that make no assumptions about the corrupting noise signal, but rather exploit characteristics of the speech signal itself. Missing feature compensation experiments were performed on speech corrupted by a variety of noises. In all cases, classifier-based mask estimation resulted in significantly better recognition accuracy than conventional mask estimation methods.
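As a rough illustration of the two ideas in the abstract, the sketch below first labels spectrogram elements as reliable or corrupt by thresholding local SNR, which requires knowing the clean speech and the noise separately, and then trains a two-class Bayesian classifier to predict those labels from per-element features. Everything here is an assumption made for illustration: the names `oracle_mask` and `GaussianBayesMask`, the 0 dB threshold, the single diagonal-covariance Gaussian per class, and the feature vectors themselves. The abstract does not specify the features or the class-conditional densities used in the paper.

```python
import numpy as np


def oracle_mask(clean_power, noise_power, snr_threshold_db=0.0):
    """Mark each time-frequency element reliable (True) if its local SNR
    exceeds the threshold. The threshold value is an assumption."""
    eps = 1e-12  # avoids division by zero and log of zero
    snr_db = 10.0 * np.log10((clean_power + eps) / (noise_power + eps))
    return snr_db > snr_threshold_db


class GaussianBayesMask:
    """Hypothetical two-class Bayes classifier: one diagonal-covariance
    Gaussian per class, with class priors estimated from label counts."""

    def fit(self, features, labels):
        features = np.asarray(features, dtype=float)
        labels = np.asarray(labels, dtype=bool)
        self.params = {}
        for cls in (False, True):
            x = features[labels == cls]
            mean = x.mean(axis=0)
            var = x.var(axis=0) + 1e-6  # variance floor for stability
            log_prior = np.log(len(x) / len(features))
            self.params[cls] = (mean, var, log_prior)
        return self

    def _log_posterior(self, features, cls):
        # Unnormalized log posterior: log likelihood plus log prior.
        mean, var, log_prior = self.params[cls]
        log_lik = -0.5 * np.sum(
            np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var, axis=1
        )
        return log_lik + log_prior

    def predict(self, features):
        """Return True where 'reliable' is the more probable class."""
        features = np.asarray(features, dtype=float)
        return self._log_posterior(features, True) > self._log_posterior(
            features, False
        )
```

In such a setup, the SNR-based labels would serve only as training targets, computed from data where speech and noise are available separately. At test time only the classifier is applied, so no stationarity assumption about the noise is needed.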


Cite as: Seltzer, M.L., Raj, B., Stern, R.M. (2000) Classifier-based mask estimation for missing feature methods of robust speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 538-541

@inproceedings{seltzer00_icslp,
  author={Michael L. Seltzer and Bhiksha Raj and Richard M. Stern},
  title={{Classifier-based mask estimation for missing feature methods of robust speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={538--541}
}