Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Comparing Several Models for Perceptual Long-Term Modeling of Amplitude and Phase Trajectories of Sinusoidal Speech

Mohammad Firouzmand (1), Laurent Girin (1), Sylvain Marchand (2)

(1) ICP-CNRS, Grenoble, France; (2) LaBRI-CNRS, France

The so-called Long-Term (LT) modeling of sinusoidal parameters, proposed in previous papers, consists of modeling the entire time trajectory of the amplitude and phase parameters over large sections of voiced speech. It differs from the usual Short-Term models, which are defined on a frame-by-frame basis. In the present paper, we focus on a specific novel contribution to this general framework: the comparison of four different Long-Term models, namely a polynomial model, a model based on discrete cosine functions, and combinations of discrete cosine functions with sine functions or with polynomials. Their performances are compared in terms of synthesized signal quality, data compression and modeling accuracy, and the relevance of the study to speech coding is shown.
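As an illustration of the general idea only, the following minimal sketch fits one synthetic amplitude trajectory with two of the model families named above: a polynomial model and a discrete cosine model, both estimated by plain least squares. Everything in it is an assumption made for the example: the function names, the model orders, the normalized time axis, the cosine basis definition and the synthetic trajectory are hypothetical, and the perceptual criterion and the combined models studied in the paper are not reproduced.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def least_squares_fit(basis, samples):
    """Fit samples with a linear combination of basis functions.

    basis[k][n] is the value of the k-th basis function at sample n.
    The coefficients are obtained from the normal equations.
    """
    p = len(basis)
    gram = [[sum(u * v for u, v in zip(basis[i], basis[j])) for j in range(p)]
            for i in range(p)]
    rhs = [sum(u * s for u, s in zip(basis[i], samples)) for i in range(p)]
    coeffs = solve_linear(gram, rhs)
    model = [sum(coeffs[k] * basis[k][n] for k in range(p))
             for n in range(len(samples))]
    return coeffs, model

def polynomial_basis(num_samples, order):
    """Monomials 1, t, ..., t^order on a time axis normalized to [-1, 1]."""
    t = [2.0 * n / (num_samples - 1) - 1.0 for n in range(num_samples)]
    return [[tn ** k for tn in t] for k in range(order + 1)]

def cosine_basis(num_samples, order):
    """Discrete cosine functions cos(pi * k * (n + 0.5) / N), k = 0..order."""
    return [[math.cos(math.pi * k * (n + 0.5) / num_samples)
             for n in range(num_samples)] for k in range(order + 1)]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Synthetic, slowly varying "amplitude trajectory" (illustrative data only).
N = 50
trajectory = [1.0 + 0.3 * math.sin(2 * math.pi * n / N)
              + 0.1 * math.cos(6 * math.pi * n / N) for n in range(N)]

ORDER = 6  # hypothetical model order: ORDER + 1 coefficients per trajectory
poly_coeffs, poly_model = least_squares_fit(polynomial_basis(N, ORDER), trajectory)
cos_coeffs, cos_model = least_squares_fit(cosine_basis(N, ORDER), trajectory)

print("polynomial model RMS error:", rms_error(trajectory, poly_model))
print("cosine model RMS error:    ", rms_error(trajectory, cos_model))
```

In this sketch a whole trajectory of N samples is summarized by ORDER + 1 coefficients, which is the kind of trade-off between data compression and modeling accuracy that the comparison addresses. Raising ORDER can only lower the least-squares error, at the cost of more coefficients to encode.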

