Variational Bayesian Multi-Channel Speech Dereverberation Under Noisy Environments with Probabilistic Convolutive Transfer Function

Masahito Togami, Tatsuya Komatsu


In this paper, we propose a multi-channel speech dereverberation method that can reduce reverberation even when the acoustic transfer functions (ATFs) are time-varying under noisy environments. The microphone input signal is modeled as a convolutive mixture in the time-frequency domain so as to incorporate late reverberation whose tap length is longer than the frame size of the short-term Fourier transform. To reduce reverberation effectively under time-varying ATF conditions, the proposed method extends the deterministic convolutive transfer function (D-CTF) into a probabilistic convolutive transfer function (P-CTF). A variational Bayesian framework is applied to approximate the joint posterior probability density function of the speech source signal and the ATFs. The variational posterior probability density functions and the other parameters are iteratively updated so as to maximize an evidence lower bound (ELBO). Experimental results under time-varying ATFs and background noise show that the proposed method reduces reverberation more accurately than the weighted prediction error (WPE) method and the Kalman-EM for dereverberation (KEMD).
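The convolutive mixture model underlying the abstract can be sketched as follows. This is an illustrative NumPy example, not the authors' implementation: each time-frequency bin of the observation is the source STFT convolved with an L-tap convolutive transfer function (CTF) along the frame axis, plus noise, so late reverberation longer than a single STFT frame is captured. All array shapes and variable names here are assumptions for illustration.

```python
import numpy as np

# Sketch of the observation model (assumed form, single channel for brevity):
#   x[t, f] = sum_{l=0}^{L-1} h[l, f] * s[t - l, f] + n[t, f]
# where s is the source STFT, h the L-tap CTF per frequency bin, n noise.

rng = np.random.default_rng(0)
T, F, L = 50, 4, 3  # STFT frames, frequency bins, CTF taps (illustrative sizes)

s = rng.standard_normal((T, F)) + 1j * rng.standard_normal((T, F))  # source STFT
h = rng.standard_normal((L, F)) + 1j * rng.standard_normal((L, F))  # CTF taps
n = 0.01 * (rng.standard_normal((T, F)) + 1j * rng.standard_normal((T, F)))

def ctf_mix(s, h, n):
    """Convolve the source with the CTF along the frame axis, per frequency bin."""
    T = s.shape[0]
    L = h.shape[0]
    x = n.copy()
    for t in range(T):
        for l in range(min(L, t + 1)):  # past frames contribute late reverberation
            x[t] += h[l] * s[t - l]
    return x

x = ctf_mix(s, h, n)
print(x.shape)  # (50, 4)
```

In the paper's P-CTF setting, `h` is treated as a random variable rather than a fixed (deterministic) quantity, and a variational posterior over both `s` and `h` is updated to maximize the ELBO.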


 DOI: 10.21437/Interspeech.2019-1220

Cite as: Togami, M., Komatsu, T. (2019) Variational Bayesian Multi-Channel Speech Dereverberation Under Noisy Environments with Probabilistic Convolutive Transfer Function. Proc. Interspeech 2019, 106-110, DOI: 10.21437/Interspeech.2019-1220.


@inproceedings{Togami2019,
  author={Masahito Togami and Tatsuya Komatsu},
  title={{Variational Bayesian Multi-Channel Speech Dereverberation Under Noisy Environments with Probabilistic Convolutive Transfer Function}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={106--110},
  doi={10.21437/Interspeech.2019-1220},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1220}
}