8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Estimating Detailed Spectral Envelopes Using Articulatory Clustering

Yoshinori Shiga, Simon King

University of Edinburgh, UK

This paper presents an articulatory-acoustic mapping in which detailed spectral envelopes are estimated. During estimation, the harmonics of a range of F0 values are derived from the spectra of multiple voiced speech signals produced with similar articulator settings. The envelope formed by these harmonics is represented by a cepstrum, which is computed by fitting the peaks of all the harmonics in the frequency domain using the weighted least-squares method. Experimental results show that the spectral envelopes are estimated most accurately when the cepstral order is 48-64 for a female speaker. This suggests that representing the real response of the vocal tract requires high-quefrency elements that conventional speech synthesis methods are forced to discard in order to eliminate the pitch component of speech.
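The cepstral fitting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact model or weighting, so the sketch assumes the common cosine-series form of the log-amplitude envelope, log|H(f)| = c_0 + 2 Σ c_n cos(2πfn/fs), and leaves the weights to the caller. The function names `fit_cepstrum_wls` and `cepstrum_to_log_envelope`, the sampling rate, and the F0 values in the demo are all hypothetical.

```python
import numpy as np


def _cosine_basis(freqs_hz, order, fs):
    """Design matrix for the assumed model: log|H(f)| = c0 + 2*sum_n c_n cos(2*pi*f*n/fs)."""
    n = np.arange(order + 1)
    basis = np.cos(2.0 * np.pi * np.outer(freqs_hz, n) / fs)
    basis[:, 1:] *= 2.0  # factor 2 for n >= 1 (symmetric real cepstrum)
    return basis


def fit_cepstrum_wls(freqs_hz, log_amps, weights, order, fs):
    """Fit cepstral coefficients to harmonic peaks by weighted least squares.

    freqs_hz : harmonic frequencies pooled from several frames (any F0 values)
    log_amps : natural-log amplitudes of those harmonic peaks
    weights  : non-negative weight per harmonic (weighting scheme is an assumption)
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    log_amps = np.asarray(log_amps, dtype=float)
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    basis = _cosine_basis(freqs_hz, order, fs)
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    cep, *_ = np.linalg.lstsq(basis * sqrt_w[:, None], log_amps * sqrt_w, rcond=None)
    return cep


def cepstrum_to_log_envelope(cep, freqs_hz, fs):
    """Evaluate the log-amplitude envelope defined by `cep` at the given frequencies."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return _cosine_basis(freqs_hz, len(cep) - 1, fs) @ np.asarray(cep, dtype=float)


# Synthetic demo: harmonics of four F0 values sample one known envelope.
fs = 16000.0
true_cep = np.array([1.0, 0.5, -0.3, 0.2, 0.1])
harmonics = np.concatenate(
    [np.arange(f0, fs / 2, f0) for f0 in (180.0, 200.0, 220.0, 240.0)]
)
peaks = cepstrum_to_log_envelope(true_cep, harmonics, fs)
estimated = fit_cepstrum_wls(harmonics, peaks, np.ones_like(harmonics), order=4, fs=fs)
```

Pooling harmonics from frames with different F0 values samples the envelope more densely than any single frame can, which is what makes a high cepstral order feasible in this kind of fit. With real speech the peaks are noisy and the pooled frames only approximately share an envelope, so the choice of weights and order matters far more than in this noise-free demo.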


Bibliographic reference. Shiga, Yoshinori / King, Simon (2004): "Estimating detailed spectral envelopes using articulatory clustering", in INTERSPEECH-2004, 2485-2488.