A Deep Neural Network Based Harmonic Noise Model for Speech Enhancement

Zhiheng Ouyang, Hongjiang Yu, Wei-Ping Zhu, Benoit Champagne

In this paper, we present a novel deep neural network (DNN) based speech enhancement method that uses a harmonic noise model (HNM) to estimate the clean speech. The clean speech is modeled with the HNM in the short-time Fourier transform domain, and time-frequency features extracted from the noisy speech are used to train the DNN, which predicts the harmonic and residual amplitudes of the clean speech. To emphasize the harmonic component and reduce the effect of the residual, a scaling factor is applied to the residual amplitude. The enhanced speech is reconstructed from the estimated clean speech amplitude and the noisy phase of the HNM. Experimental results demonstrate that the proposed HNM-DNN method outperforms two existing DNN-based speech enhancement methods in terms of both speech quality and intelligibility.
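As a rough illustration of the reconstruction step described in the abstract, the sketch below combines per-bin harmonic and residual amplitudes for a single STFT frame, scales the residual, and attaches the phase of the noisy observation. The abstract does not state how the two amplitudes are combined or what value the scaling factor takes, so the additive combination, the value 0.5, and the names `reconstruct_frame` and `residual_scale` are all assumptions made for this example. The amplitude lists stand in for DNN outputs and are made-up numbers.

```python
import cmath


def reconstruct_frame(noisy_bins, harmonic_amp, residual_amp, residual_scale=0.5):
    """Rebuild one enhanced STFT frame from estimated amplitudes and noisy phase.

    noisy_bins     : complex STFT coefficients of the noisy speech (one frame)
    harmonic_amp   : estimated harmonic amplitude per bin (stand-in for DNN output)
    residual_amp   : estimated residual amplitude per bin (stand-in for DNN output)
    residual_scale : factor that de-emphasizes the residual (assumed value)
    """
    if not (len(noisy_bins) == len(harmonic_amp) == len(residual_amp)):
        raise ValueError("all inputs must have one entry per frequency bin")

    enhanced = []
    for y, h, r in zip(noisy_bins, harmonic_amp, residual_amp):
        # Assumption: clean amplitude = harmonic + scaled residual.
        amplitude = h + residual_scale * r
        # The phase is taken from the noisy observation, not estimated.
        phase = cmath.phase(y)
        enhanced.append(cmath.rect(amplitude, phase))
    return enhanced


# Toy frame with three frequency bins (illustrative values only).
noisy = [complex(1.0, 1.0), complex(0.0, -2.0), complex(-3.0, 0.0)]
harmonic = [0.8, 1.5, 2.0]
residual = [0.4, 0.2, 0.6]
frame = reconstruct_frame(noisy, harmonic, residual, residual_scale=0.5)
```

Setting `residual_scale` below 1 reduces the residual's contribution relative to the harmonic part, which is the role the abstract assigns to the scaling factor. A full system would apply this to every frame and then invert the STFT.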

DOI: 10.21437/Interspeech.2018-1114

Cite as: Ouyang, Z., Yu, H., Zhu, W., Champagne, B. (2018) A Deep Neural Network Based Harmonic Noise Model for Speech Enhancement. Proc. Interspeech 2018, 3224-3228, DOI: 10.21437/Interspeech.2018-1114.

  author={Zhiheng Ouyang and Hongjiang Yu and Wei-Ping Zhu and Benoit Champagne},
  title={A Deep Neural Network Based Harmonic Noise Model for Speech Enhancement},
  booktitle={Proc. Interspeech 2018},