September 22-25, 1997
This paper describes a low bit-rate segmental formant vocoder. The formants are estimated using a mixture of Gaussians whose means are constrained to vary linearly with time within a segment. A new method of smoothing the power spectrum is used to improve modelling with mixtures of Gaussians. Pitch is estimated from the autocorrelation function, and voicing is detected using both the autocorrelation function and the energy in the spectrum. Optimal segment boundaries are obtained with a dynamic programming procedure based on the power-normalised log-likelihood of the segment. Magnitude-only sinusoidal synthesis is then used to synthesise speech from the estimated spectrum. Using multiple codebooks, an average bit-rate of 500 bps has been obtained.
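As an illustration of the autocorrelation-based pitch estimation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The function names, the 60-400 Hz search range, and the 0.3 voicing threshold are assumptions made for this example. The voicing decision here uses only the normalised autocorrelation peak, whereas the paper also uses the energy in the spectrum.

```python
import math


def autocorrelation(frame, max_lag):
    """Return the biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0,
                   voicing_threshold=0.3):
    """Return (f0_hz, voiced); f0_hz is None for an unvoiced frame.

    f_min, f_max and voicing_threshold are illustrative assumptions,
    not values taken from the paper.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    r = autocorrelation(frame, lag_max)
    if r[0] <= 0.0:
        # A frame with no energy is treated as unvoiced.
        return None, False
    # The pitch period is taken as the lag with the largest autocorrelation
    # inside the allowed search range.
    best_lag = max(range(lag_min, lag_max + 1), key=lambda lag: r[lag])
    # The peak height relative to r[0] serves as a simple voicing measure.
    if r[best_lag] / r[0] < voicing_threshold:
        return None, False
    return sample_rate / best_lag, True


if __name__ == "__main__":
    # Synthetic test signal: a 100 Hz sine sampled at 8 kHz (50 ms frame).
    rate = 8000
    frame = [math.sin(2 * math.pi * 100 * i / rate) for i in range(400)]
    print(estimate_pitch(frame, rate))
```

A practical estimator would typically add windowing, sub-sample peak interpolation, and smoothing of the pitch track across frames; these are omitted to keep the sketch short.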
Bibliographic reference. Zolfaghari, Parham / Robinson, Tony (1997): "A segmental formant vocoder based on linearly varying mixture of Gaussians", In EUROSPEECH-1997, 425-428.