ISCA Archive Interspeech 2005

Supergaussian GARCH models for speech signals

Israel Cohen

In this paper, we introduce supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models for speech signals in the short-time Fourier transform (STFT) domain. We address the problem of speech enhancement and show that estimating the variances of the STFT expansion coefficients with GARCH models yields higher speech quality than estimating them with the decision-directed method, whether the fidelity criterion is the minimum mean-squared error (MMSE) of the spectral coefficients or the MMSE of the log-spectral amplitude (LSA). Furthermore, while a Gaussian model is inferior to Gamma and Laplacian models when the variances are estimated by the decision-directed method, a Gaussian model is superior when they are estimated by the GARCH modeling method. This facilitates MMSE-LSA estimation while still taking the heavy-tailed distribution into consideration.
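
For readers unfamiliar with GARCH, the following is a minimal sketch of the textbook GARCH(1,1) conditional-variance recursion, in which each variance depends on the previous squared sample and the previous variance. It is not the estimator developed in the paper, which operates on STFT coefficients. The function name garch11_variance and the parameters omega, alpha, beta, and initial_variance are assumptions chosen for illustration.

```python
def garch11_variance(samples, omega, alpha, beta, initial_variance):
    """Conditional variances of a textbook GARCH(1,1) model (illustrative only).

    variance[t] = omega + alpha * samples[t-1]**2 + beta * variance[t-1]

    All names here are hypothetical; the paper's STFT-domain formulation
    is not reproduced.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("require omega > 0, alpha >= 0, beta >= 0")
    if initial_variance <= 0:
        raise ValueError("initial_variance must be positive")
    if len(samples) == 0:
        return []

    variances = [initial_variance]
    # Each new variance is driven by the previous sample and previous variance.
    for x in samples[:-1]:
        variances.append(omega + alpha * x * x + beta * variances[-1])
    return variances


if __name__ == "__main__":
    # A large sample raises the variance predicted for the next step.
    print(garch11_variance([1.0, 2.0, 0.0], 0.1, 0.2, 0.7, 1.0))
```

In this sketch a large sample raises the variance predicted for the following step, which then decays at a rate governed by beta. This is the volatility-clustering behavior that GARCH models are designed to capture.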

doi: 10.21437/Interspeech.2005-673

Cite as: Cohen, I. (2005) Supergaussian GARCH models for speech signals. Proc. Interspeech 2005, 2053-2056, doi: 10.21437/Interspeech.2005-673

  author={Israel Cohen},
  title={{Supergaussian GARCH models for speech signals}},
  booktitle={Proc. Interspeech 2005},