ISCA Archive Eurospeech 1999

Speech coding using mixture of gaussians polynomial model

Parham Zolfaghari, Tony Robinson

We have investigated a novel method of spectral estimation based on a mixture of Gaussians in a sinusoidal analysis and synthesis framework. After quantisation of this parametric scheme, a fixed frame-rate coder operating at a bit-rate of around 2.4 kbit/s has been developed. This paper describes an extension to this spectral model that constrains the parameters of the mixture of Gaussians to lie on a polynomial trajectory over a segment of speech data. This is referred to as the mixture of Gaussians polynomial model (MGPM). To realise a segmental coder, dynamic programming is performed over the utterance. The segmental representation of the spectra yields a log-likelihood score over each segment, which is used as the cost function in the dynamic programming algorithm. Speech coding components such as pitch, voicing and gain are also described segmentally. A number of segmental coders are presented with bit-rates in the range of 350 to 650 bit/s. At these bit-rates the coders produce good, intelligible coded speech, as evaluated using Diagnostic Rhyme Test (DRT) scoring.
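The segmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame model parameters are already available as a NumPy array, it replaces the MGPM log-likelihood with a unit-variance Gaussian negative log-likelihood (a sum of squared residuals about the fitted polynomial trajectory), and it adds a hypothetical per-segment `penalty` so that the search does not place every frame in its own segment. The names `segment_cost`, `segment_utterance`, `order`, `max_len` and `penalty` are all illustrative.

```python
import numpy as np


def segment_cost(frames, order):
    """Cost of modelling one segment with polynomial parameter trajectories.

    frames: array of shape (n_frames, n_params), one row per analysis frame.
    Returns the sum of squared residuals, i.e. a negative log-likelihood up to
    constants under an assumed unit-variance Gaussian error model.
    """
    n = frames.shape[0]
    t = np.linspace(0.0, 1.0, n)           # normalised time within the segment
    deg = min(order, n - 1)                # keep short segments well-posed
    basis = np.vander(t, deg + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(basis, frames, rcond=None)
    residual = frames - basis @ coeffs
    return float(np.sum(residual ** 2))


def segment_utterance(frames, order=1, max_len=20, penalty=1.0):
    """Dynamic programming over the utterance.

    Finds the segment boundaries minimising the total segment cost plus an
    assumed fixed penalty per segment. Returns (start, end) pairs, end exclusive.
    """
    n = frames.shape[0]
    best = np.full(n + 1, np.inf)          # best[e]: lowest cost of frames[:e]
    best[0] = 0.0
    back = np.zeros(n + 1, dtype=int)      # back[e]: start of the last segment
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + segment_cost(frames[start:end], order) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Trace the boundaries back from the end of the utterance.
    segments = []
    end = n
    while end > 0:
        start = int(back[end])
        segments.append((start, end))
        end = start
    return segments[::-1]
```

Calling `segment_utterance(frames, order=1)` on an array of frame parameters returns the segment boundaries; in a coder, only the polynomial coefficients and the duration of each segment would then need to be quantised, which is where the bit-rate saving over a fixed frame-rate scheme would come from.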


doi: 10.21437/Eurospeech.1999-340

Cite as: Zolfaghari, P., Robinson, T. (1999) Speech coding using mixture of gaussians polynomial model. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1495-1498, doi: 10.21437/Eurospeech.1999-340

@inproceedings{zolfaghari99_eurospeech,
  author={Parham Zolfaghari and Tony Robinson},
  title={{Speech coding using mixture of gaussians polynomial model}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1495--1498},
  doi={10.21437/Eurospeech.1999-340}
}