ISCA Archive Interspeech 2005

A spectrogram model for enhanced source localization and noise-robust ASR

Guillaume Lathoud, Mathew Magimai-Doss, Bertrand Mesot

This paper proposes a simple, computationally efficient 2-mixture model approach to discriminating between speech and background noise. It is derived directly from observations on real data, and can be used in a fully unsupervised manner with the EM algorithm. A first application to sector-based, joint audio source localization and detection, using multiple microphones, confirms that the model can provide major enhancement. A second application to single-channel speech recognition in a noisy environment yields major improvement on stationary noise and promising results on non-stationary noise.
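
To illustrate the general idea of fitting a two-component mixture with EM in an unsupervised way, here is a minimal sketch in Python. It is not the model from the paper: the abstract does not state which distributions are used, so the sketch assumes two 1-D Gaussians over per-bin log-magnitude values, and the function name `fit_two_gaussians` and its parameters are hypothetical.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=100, tol=1e-8):
    """EM for a 1-D, 2-component Gaussian mixture.

    Assumption: Gaussians are a stand-in; the paper's actual
    noise and speech distributions are not given in the abstract.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Initialise the components from the lower and upper halves of the data.
    med = np.median(x)
    mu = np.array([x[x <= med].mean(), x[x > med].mean()])
    var = np.array([x.var(), x.var()]) + 1e-6
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of weighted component densities, shape (2, N).
        log_p = (np.log(w)[:, None]
                 - 0.5 * np.log(2 * np.pi * var)[:, None]
                 - 0.5 * (x[None, :] - mu[:, None]) ** 2 / var[:, None])
        log_norm = np.logaddexp(log_p[0], log_p[1])
        resp = np.exp(log_p - log_norm)  # posterior of each component per sample
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=1)
        w = nk / x.size
        mu = (resp @ x) / nk
        var = (resp * (x[None, :] - mu[:, None]) ** 2).sum(axis=1) / nk + 1e-6
        # Stop when the log-likelihood no longer improves.
        ll = log_norm.sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, mu, var, resp
```

Under these assumptions, the component with the larger mean would be read as "speech" and the other as "background noise", and the returned posteriors `resp` could serve as a soft speech/noise mask over spectrogram bins.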


doi: 10.21437/Interspeech.2005-747

Cite as: Lathoud, G., Magimai-Doss, M., Mesot, B. (2005) A spectrogram model for enhanced source localization and noise-robust ASR. Proc. Interspeech 2005, 2345-2348, doi: 10.21437/Interspeech.2005-747

@inproceedings{lathoud05_interspeech,
  author={Guillaume Lathoud and Mathew Magimai-Doss and Bertrand Mesot},
  title={{A spectrogram model for enhanced source localization and noise-robust ASR}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2345--2348},
  doi={10.21437/Interspeech.2005-747}
}