In this work, we develop models that capture the dynamic characteristics and statistical dependencies of acoustic attributes in a segment-based framework. Our approach is based on the creation of a track, f_a, for each phonetic unit a. The track serves as a model of the dynamic trajectories of the acoustic attributes over the segment. The tracks capture segment-level spectral dynamics without making any assumptions about the linearity or stationarity of the speech signal. The statistical framework for scoring incorporates the auto- and cross-correlation properties of the track error over time within a segment. This paper presents the results of a series of vowel classification experiments using the TIMIT acoustic-phonetic corpus. A classification accuracy of 68.9% was achieved, which compares favorably to other vowel classification experiments using the same corpus.
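As a rough illustration of the idea, the following Python sketch builds one track per phonetic unit and scores a segment by the statistics of its track error. It is not the authors' implementation: the linear time normalization to a fixed number of frames, the averaging used to form the track, the full-covariance Gaussian on the stacked error, and all names (`resample`, `fit_unit`, `log_score`, `classify`, `n_frames`, `ridge`) are assumptions made for this example only.

```python
import numpy as np

def resample(segment, n_frames):
    """Linearly interpolate a (T, D) segment onto n_frames equally spaced points.
    Assumption: simple linear time normalization, chosen for illustration."""
    segment = np.asarray(segment, dtype=float)
    src = np.linspace(0.0, 1.0, segment.shape[0])
    dst = np.linspace(0.0, 1.0, n_frames)
    cols = [np.interp(dst, src, segment[:, d]) for d in range(segment.shape[1])]
    return np.column_stack(cols)

def fit_unit(segments, n_frames=5, ridge=1e-3):
    """Estimate a track (mean trajectory) and the error covariance for one unit."""
    X = np.stack([resample(s, n_frames) for s in segments])  # (N, n_frames, D)
    track = X.mean(axis=0)                                    # (n_frames, D)
    errors = (X - track).reshape(len(X), -1)                  # (N, n_frames * D)
    # The full covariance of the stacked error couples every frame with every
    # attribute, so it holds both auto- and cross-correlation terms.
    # The ridge term is a hypothetical regularizer to keep the matrix invertible.
    cov = errors.T @ errors / len(X) + ridge * np.eye(errors.shape[1])
    return track, cov

def log_score(segment, track, cov):
    """Gaussian log-likelihood of the segment's track error (an assumed model)."""
    e = (resample(segment, track.shape[0]) - track).ravel()
    _, logdet = np.linalg.slogdet(cov)
    maha = e @ np.linalg.solve(cov, e)
    return -0.5 * (maha + logdet + e.size * np.log(2.0 * np.pi))

def classify(segment, models):
    """Return the unit whose track and error model give the highest score."""
    return max(models, key=lambda unit: log_score(segment, *models[unit]))
```

Here `models` would be a dictionary mapping each unit label to the `(track, cov)` pair returned by `fit_unit`. No parametric form is imposed on the trajectory itself, which is in the spirit of the abstract's claim about avoiding linearity and stationarity assumptions; only the error is modeled statistically.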
Bibliographic reference. Goldenthal, William D. / Glass, James R. (1993): "Modelling spectral dynamics for vowel classification", In EUROSPEECH'93, 289-292.