Estimation of the Probability Distribution of Spectral Fine Structure in the Speech Source

Tom Bäckström


The efficiency of many speech processing methods relies on accurate modeling of the distribution of the signal spectrum, and a majority of prior works suggest that the spectral components follow the Laplace distribution. To improve the probability distribution models using our knowledge of speech source modeling, we argue that the model should in fact be a multiplicative mixture model, including terms for voiced and unvoiced utterances. While prior works have applied Gaussian mixture models, we demonstrate that a mixture of generalized Gaussian models follows the observations more accurately. The proposed estimation method is based on measuring the ratio of Lp-norms between spectral bands. Such ratios follow the Beta distribution when the input signal is generalized Gaussian, so the estimated Beta parameters can be used to determine the underlying parameters of the mixture of generalized Gaussian distributions.
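
The link between Lp-norm ratios and the Beta distribution can be checked numerically. The following is a minimal sketch under stated assumptions, not the estimator from the paper: it assumes independent, unit-scale generalized Gaussian samples with a single known shape p (no mixture), and it uses the ratio E_A / (E_A + E_B), where E is the sum of |x|^p over a band. Under those assumptions |x|^p is Gamma(1/p) distributed, so the ratio follows Beta(n_A/p, n_B/p). All function names are hypothetical, and the moment-matching step is an illustrative choice.

```python
import numpy as np


def sample_generalized_gaussian(rng, p, size):
    """Draw samples with density proportional to exp(-|x|**p) (unit scale, assumed)."""
    # If G ~ Gamma(1/p, 1), then sign * G**(1/p) is generalized Gaussian with shape p.
    g = rng.gamma(shape=1.0 / p, scale=1.0, size=size)
    signs = rng.choice([-1.0, 1.0], size=size)
    return signs * g ** (1.0 / p)


def band_ratio(x_a, x_b, p):
    """Ratio of p-th-power band energies, E_A / (E_A + E_B), per trial (row)."""
    e_a = np.sum(np.abs(x_a) ** p, axis=-1)
    e_b = np.sum(np.abs(x_b) ** p, axis=-1)
    return e_a / (e_a + e_b)


def implied_shape(ratios, n_a, n_b):
    """Shape p implied by matching the first two moments of a Beta(n_a/p, n_b/p)."""
    m = ratios.mean()
    v = ratios.var()
    # For Beta(a, b): var = m * (1 - m) / (a + b + 1), hence a + b below.
    concentration = m * (1.0 - m) / v - 1.0
    return (n_a + n_b) / concentration


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n_a, n_b, trials = 1.0, 8, 8, 20000  # p = 1 corresponds to the Laplace case
    x_a = sample_generalized_gaussian(rng, p, (trials, n_a))
    x_b = sample_generalized_gaussian(rng, p, (trials, n_b))
    r = band_ratio(x_a, x_b, p)
    print("mean ratio:", r.mean(), "implied shape:", implied_shape(r, n_a, n_b))
```

The mean of the ratio depends only on the band sizes, so in this sketch the shape information sits in the spread of the ratios: heavier-tailed inputs (smaller p) give a wider Beta distribution.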


DOI: 10.21437/Interspeech.2017-389

Cite as: Bäckström, T. (2017) Estimation of the Probability Distribution of Spectral Fine Structure in the Speech Source. Proc. Interspeech 2017, 344-348, DOI: 10.21437/Interspeech.2017-389.


@inproceedings{Bäckström2017,
  author={Tom Bäckström},
  title={Estimation of the Probability Distribution of Spectral Fine Structure in the Speech Source},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={344--348},
  doi={10.21437/Interspeech.2017-389},
  url={http://dx.doi.org/10.21437/Interspeech.2017-389}
}