EUROSPEECH 2003 - INTERSPEECH 2003
In this paper, we investigate the application of the temporal decomposition (TD) technique to describe the temporal patterns of speech excitation parameter contours, namely gain, pitch, and voicing. We use a common set of event functions to describe the temporal structure of both the spectral and the excitation parameters, and then quantize the resulting targets. Experimental results show that each speech excitation parameter contour can be well described, with low reconstruction error, by a set of excitation targets combined with the event functions obtained from TD analysis of line spectral frequency (LSF) parameters. Moreover, the excitation targets can be quantized efficiently by a combination of two uniform quantizers: one working directly on the logarithmic excitation targets, and the other working on the difference between the current and previous logarithmic excitation targets.
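The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it beyond the abstract is an assumption: the function names are hypothetical, triangular event functions stand in for the ones that TD analysis of LSF parameters would produce, targets are fitted by ordinary least squares, and the two uniform quantizers are combined by picking, per target, whichever gives the smaller error. The quantizer ranges and bit allocation are illustrative only.

```python
import numpy as np

def triangular_events(n_frames, centers):
    """Hypothetical stand-in for TD event functions: overlapping hat
    functions centred at `centers` (increasing) that sum to one per frame."""
    frames = np.arange(n_frames)
    phi = np.zeros((n_frames, len(centers)))
    for k in range(len(centers)):
        unit = np.zeros(len(centers))
        unit[k] = 1.0
        phi[:, k] = np.interp(frames, centers, unit)
    return phi

def estimate_targets(contour, event_functions):
    """Least-squares targets a such that contour ~= event_functions @ a.
    contour has shape (N,), event_functions has shape (N, K)."""
    targets, *_ = np.linalg.lstsq(event_functions, contour, rcond=None)
    return targets

def uniform_quantize(x, lo, hi, bits):
    """Uniform scalar quantizer with 2**bits levels spanning [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    index = np.clip(np.round((x - lo) / step), 0, levels - 1)
    return lo + index * step

def quantize_log_targets(log_targets, direct_range, diff_range, bits):
    """Quantize each log target either directly or as a difference from the
    previously reconstructed target. The selection rule (smaller error wins)
    is an assumption, not taken from the paper."""
    quantized = np.empty(len(log_targets))
    modes = []
    prev = None
    for k, t in enumerate(log_targets):
        direct = uniform_quantize(t, direct_range[0], direct_range[1], bits)
        if prev is None:
            quantized[k], mode = direct, "direct"
        else:
            delta = uniform_quantize(t - prev, diff_range[0], diff_range[1], bits)
            diff = prev + delta
            if abs(diff - t) < abs(direct - t):
                quantized[k], mode = diff, "diff"
            else:
                quantized[k], mode = direct, "direct"
        modes.append(mode)
        prev = quantized[k]  # the decoder only knows the reconstructed value
    return quantized, modes

# Illustrative use on a synthetic log-pitch contour.
phi = triangular_events(50, [0, 12, 25, 40, 49])
true_targets = np.array([4.0, 4.5, 5.2, 4.8, 4.1])
contour = phi @ true_targets
targets = estimate_targets(contour, phi)
q_targets, modes = quantize_log_targets(targets, (3.0, 6.0), (-1.0, 1.0), bits=5)
reconstructed = phi @ q_targets
```

In this sketch a decoder would also need one extra bit per target to know which quantizer was used; whether the paper signals the choice this way is not stated in the abstract.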
Bibliographic reference. Nguyen, Phu Chien / Akagi, Masato (2003): "Efficient quantization of speech excitation parameters using temporal decomposition", In EUROSPEECH-2003, 449-452.