Sixth European Conference on Speech Communication and Technology

We have investigated a novel method of spectral estimation based on a mixture of Gaussians in a sinusoidal analysis and synthesis framework. After quantisation of this parametric scheme, a fixed frame-rate coder operating at a bitrate of around 2.4 kbits/s has been developed. This paper describes an extension to this spectral model in which the parameters of the mixture of Gaussians are constrained to follow a polynomial trajectory over a segment of speech data. This is referred to as the mixture of Gaussians polynomial model (MGPM). To realise a segmental coder, dynamic programming is performed over the utterance. The segmental representation of the spectra yields a log-likelihood score over each segment, which is used as the cost function in the dynamic programming algorithm. Speech coding components such as pitch, voicing and gain are also described segmentally. A number of segmental coders are presented with bitrates in the range of 350 to 650 bits/s. At these bitrates, the coders offer good, intelligible coded speech, as evaluated using DRT scoring.
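
The segmentation step can be illustrated with a minimal sketch. It is not the MGPM coder itself and rests on several simplifying assumptions: each frame is reduced to a single scalar parameter, the trajectory within a segment is a first-order polynomial (a straight line), and the sum of squared residuals of the least-squares fit stands in for the negative log-likelihood score (the two are proportional under a Gaussian error model with fixed variance). The names `linear_fit_cost` and `segment_utterance`, the maximum segment length `max_len` and the per-segment `penalty` are hypothetical and do not come from the paper.

```python
def linear_fit_cost(values):
    """Sum of squared residuals of a least-squares line through `values`.

    Assumption: this stands in for the negative log-likelihood of a segment.
    """
    n = len(values)
    if n < 2:
        return 0.0  # a single frame is fitted exactly
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tv = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    slope = s_tv / s_tt
    return sum(
        (v - v_mean - slope * (t - t_mean)) ** 2 for t, v in enumerate(values)
    )


def segment_utterance(values, max_len, penalty):
    """Dynamic programming over the utterance.

    Returns the segment boundaries as (start, end) pairs with `end` exclusive,
    together with the total cost. `penalty` is a hypothetical fixed cost per
    segment that discourages over-segmentation.
    """
    n = len(values)
    best = [0.0] + [float("inf")] * n  # best[e]: lowest cost of frames [0, e)
    back = [0] * (n + 1)               # back[e]: start of the last segment
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + linear_fit_cost(values[start:end]) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Trace the optimal boundaries back from the end of the utterance.
    bounds = []
    end = n
    while end > 0:
        bounds.append((back[end], end))
        end = back[end]
    bounds.reverse()
    return bounds, best[n]


# Toy trajectory made of two straight-line pieces.
frames = [0, 1, 2, 3, 4, 10, 8, 6, 4, 2]
boundaries, total_cost = segment_utterance(frames, max_len=6, penalty=0.5)
print(boundaries, total_cost)
```

In a segmental coder of the kind described, the scalar fit would be replaced by the log-likelihood of the spectral model over the candidate segment, while the dynamic programming recursion would keep the same form.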
