ISCA Archive Interspeech 2013

A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning

Hung-Wei Tseng, Srikanth Vishnubhotla, Mingyi Hong, Xiangfeng Wang, Jinjun Xiao, Zhi-Quan Luo, Tao Zhang

In this paper, we consider the single-channel speech enhancement problem, in which a clean speech signal must be estimated from a noisy observation. To capture the characteristics of both the noise and the speech signal, we combine the well-known Short-Time-Spectrum-Amplitude (STSA) estimator with a machine-learning-based technique called Multi-frame Sparse Dictionary Learning (MSDL). The former uses statistical information for denoising, while the latter helps preserve speech better, especially its temporal structure. The proposed algorithm, named STSA-MSDL, outperforms standard statistical algorithms such as the Wiener filter and the STSA estimator, as well as dictionary-based algorithms, when applied to the TIMIT database. Performance is evaluated with four objective metrics that measure speech intelligibility, speech distortion, background noise reduction, and overall quality.


doi: 10.21437/Interspeech.2013-133

Cite as: Tseng, H.-W., Vishnubhotla, S., Hong, M., Wang, X., Xiao, J., Luo, Z.-Q., Zhang, T. (2013) A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning. Proc. Interspeech 2013, 451-455, doi: 10.21437/Interspeech.2013-133

@inproceedings{tseng13_interspeech,
  author={Hung-Wei Tseng and Srikanth Vishnubhotla and Mingyi Hong and Xiangfeng Wang and Jinjun Xiao and Zhi-Quan Luo and Tao Zhang},
  title={{A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={451--455},
  doi={10.21437/Interspeech.2013-133}
}