A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement

Ziteng Wang, Xu Li, Xiaofei Wang, Qiang Fu, Yonghong Yan


General speaker-independent models are commonly used in non-negative matrix factorization (NMF) based speech enhancement algorithms for practical applicability, and additional regularization is necessary when choosing the optimal models for speech reconstruction. In this paper, we propose a novel use of a deep neural network (DNN) to select the models used for separating speech from noise. Specifically, multiple local dictionaries are learned, of which only one is activated for each block in the separation step. Furthermore, the temporal dependencies between blocks are modeled by a hidden Markov model (HMM), which yields a hybrid DNN-HMM framework. The most probable activation sequence is then solved by the Viterbi algorithm. Experimental evaluations focusing on a speech denoising application are carried out, and the results confirm that the proposed approach outperforms several existing methods.
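The pipeline the abstract describes can be illustrated with a minimal sketch: a Viterbi decode over per-block dictionary posteriors picks one local speech dictionary per block, and fixed-dictionary NMF activations then yield a Wiener-style speech estimate. This is not the authors' implementation; all dimensions, the random "DNN" posteriors, the transition matrix, and the Euclidean-cost multiplicative updates are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: dimensions and dictionaries are illustrative stand-ins
# for the paper's pre-learned local speech dictionaries and noise dictionary.
F, R, K = 64, 8, 3  # freq bins, atoms per dictionary, number of local speech dictionaries
W_speech = [np.abs(rng.standard_normal((F, R))) for _ in range(K)]
W_noise = np.abs(rng.standard_normal((F, R)))

def viterbi(log_post, log_A, log_pi):
    """Most probable dictionary-activation sequence over blocks.

    log_post[t, k]: log-posterior of dictionary k for block t (DNN output)
    log_A[i, j]:    log transition probability from dictionary i to j
    log_pi[k]:      log initial probability of dictionary k
    """
    T, K = log_post.shape
    delta = np.empty((T, K))
    psi = np.zeros((T, K), dtype=int)
    delta[0] = log_pi + log_post[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # (K, K): prev state x next state
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_post[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):  # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path

def nmf_activations(V, W, n_iter=50, eps=1e-9):
    """Fixed-dictionary multiplicative updates (Euclidean cost) for H in V ~= W H."""
    H = np.abs(rng.standard_normal((W.shape[1], V.shape[1])))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def enhance_block(V, W_s, W_n, eps=1e-9):
    """Separate one block: Wiener-style mask from the speech part of the factorization."""
    W = np.hstack([W_s, W_n])
    H = nmf_activations(V, W)
    S_hat = W_s @ H[: W_s.shape[1]]  # speech magnitude estimate
    N_hat = W_n @ H[W_s.shape[1]:]   # noise magnitude estimate
    return V * S_hat / (S_hat + N_hat + eps)

# Toy run: 5 blocks of 10 spectrogram frames each.
T, L = 5, 10
blocks = [np.abs(rng.standard_normal((F, L))) for _ in range(T)]
log_post = np.log(rng.dirichlet(np.ones(K), size=T))    # stand-in for DNN posteriors
log_A = np.log(np.full((K, K), 0.1) + 0.7 * np.eye(K))  # sticky transitions favor staying
log_pi = np.log(np.full(K, 1.0 / K))

path = viterbi(log_post, log_A, log_pi)
enhanced = [enhance_block(V, W_speech[k], W_noise) for V, k in zip(blocks, path)]
```

Because the mask S_hat / (S_hat + N_hat) lies in [0, 1], each enhanced block is a nonnegative, element-wise attenuation of the noisy input; the sticky transition matrix encodes the assumption that the active local dictionary changes slowly across blocks.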


DOI: 10.21437/Interspeech.2016-147

Cite as

Wang, Z., Li, X., Wang, X., Fu, Q., Yan, Y. (2016) A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement. Proc. Interspeech 2016, 3763-3767.

Bibtex
@inproceedings{Wang+2016,
author={Ziteng Wang and Xu Li and Xiaofei Wang and Qiang Fu and Yonghong Yan},
title={A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-147},
url={http://dx.doi.org/10.21437/Interspeech.2016-147},
pages={3763--3767}
}