ISCA Archive Interspeech 2009

A study of mutual front-end processing method based on statistical model for noise robust speech recognition

Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani

This paper addresses robust front-end processing for automatic speech recognition (ASR) in noise. Accurate recognition of corrupted speech requires noise-robust front-end processing, e.g., voice activity detection (VAD) and noise suppression (NS). Typically, VAD and NS are developed independently and combined as one-way processing. However, VAD and NS should not be treated as independent techniques, because sharing each other's information is important for improving front-end processing. We therefore investigate mutual front-end processing that integrates VAD and NS so that each can benefit from the other's information. In an evaluation on the CENSREC-1-C database, a concatenated speech corpus, the proposed method improves the performance of both VAD and ASR compared with the conventional method.
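As a rough illustration of what "mutual" processing means here, the following toy sketch lets VAD and NS feed each other on a sequence of per-frame energies. It is not the statistical-model method of the paper, which the abstract does not specify. The function name `mutual_front_end`, the energy-subtraction rule, the threshold, and the smoothing factor are all assumptions chosen only to show the feedback loop.

```python
def mutual_front_end(frame_energies, init_noise=1.0, threshold=1.0, alpha=0.9):
    """Toy VAD/NS feedback loop on per-frame energies (illustrative only).

    NS -> VAD: the speech decision is made on the noise-suppressed energy.
    VAD -> NS: the noise estimate is updated only in frames judged non-speech.
    All parameters are hypothetical and not taken from the paper.
    """
    noise = init_noise
    decisions = []
    cleaned = []
    for energy in frame_energies:
        # Noise suppression: subtract the current noise estimate, floor at zero.
        suppressed = max(energy - noise, 0.0)
        # Voice activity detection on the suppressed energy.
        is_speech = suppressed > threshold * noise
        # Feed the VAD decision back: track noise only when no speech is detected.
        if not is_speech:
            noise = alpha * noise + (1.0 - alpha) * energy
        decisions.append(is_speech)
        cleaned.append(suppressed)
    return decisions, cleaned


if __name__ == "__main__":
    flags, out = mutual_front_end([1.0, 1.0, 10.0, 10.0, 1.0])
    print(flags)
    print(out)
```

In a one-way pipeline, by contrast, the noise estimate would be computed without access to the VAD decision, so speech frames could leak into the noise estimate.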


doi: 10.21437/Interspeech.2009-356

Cite as: Fujimoto, M., Ishizuka, K., Nakatani, T. (2009) A study of mutual front-end processing method based on statistical model for noise robust speech recognition. Proc. Interspeech 2009, 1235-1238, doi: 10.21437/Interspeech.2009-356

@inproceedings{fujimoto09_interspeech,
  author={Masakiyo Fujimoto and Kentaro Ishizuka and Tomohiro Nakatani},
  title={{A study of mutual front-end processing method based on statistical model for noise robust speech recognition}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1235--1238},
  doi={10.21437/Interspeech.2009-356}
}