Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993


Modelling Spectral Dynamics for Vowel Classification

William D. Goldenthal, James R. Glass

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

In this work, we aim to develop models that capture the dynamic characteristics and statistical dependencies of acoustic attributes in a segment-based framework. Our approach is based on the creation of a track, f_a, for each phonetic unit a. The track serves as a model of the dynamic trajectories of the acoustic attributes over the segment. The tracks are intended to capture segment-level spectral dynamics without making any assumptions about the linearity or stationarity of the speech signal. The statistical framework for scoring incorporates the auto- and cross-correlation properties of the track error over time within a segment. This paper presents the results of a series of vowel classification experiments using the TIMIT acoustic-phonetic corpus. Classification performance of 68.9% was achieved, which compares favorably to other vowel classification experiments using the same corpus.
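The abstract does not say how tracks are estimated or how the track error is scored, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a track is the mean of time-normalised training trajectories, and that the track error, stacked over all time points and attributes, is scored with a full-covariance Gaussian. The off-diagonal covariance terms are what carry the correlations across time and across attributes. All names (`resample`, `train_track`, `log_likelihood`, `classify`, `make_segment`, `N_POINTS`) and the synthetic two-class data are hypothetical.

```python
import math
import random

N_POINTS = 3  # assumed number of time-normalised points per track (must be >= 2)

def resample(frames, n_points=N_POINTS):
    """Linearly interpolate a variable-length list of frame vectors to n_points frames."""
    if len(frames) == 1:
        return [list(frames[0]) for _ in range(n_points)]
    out = []
    for i in range(n_points):
        pos = i * (len(frames) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out

def flatten(trajectory):
    """Stack all frames of a trajectory into one long vector."""
    return [x for frame in trajectory for x in frame]

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def train_track(segments, ridge=1e-3):
    """Assumed estimator: mean trajectory plus full covariance of the stacked track error."""
    trajs = [flatten(resample(s)) for s in segments]
    n, dim = len(trajs), len(trajs[0])
    track = [sum(t[k] for t in trajs) / n for k in range(dim)]
    errors = [[t[k] - track[k] for k in range(dim)] for t in trajs]
    cov = [[sum(e[i] * e[j] for e in errors) / n for j in range(dim)] for i in range(dim)]
    for i in range(dim):
        cov[i][i] += ridge  # small ridge keeps the covariance positive definite
    return track, cholesky(cov)

def log_likelihood(segment, model):
    """Gaussian log-likelihood of the segment's track error under one unit's model."""
    track, low = model
    err = [x - m for x, m in zip(flatten(resample(segment)), track)]
    z = []  # forward substitution: solve low * z = err
    for i in range(len(err)):
        z.append((err[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(err)))
    return -0.5 * (sum(v * v for v in z) + log_det + len(err) * math.log(2 * math.pi))

def classify(segment, models):
    """Pick the unit whose track model gives the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(segment, models[name]))

def make_segment(label, noise=0.1):
    """Synthetic stand-in data: a 2-attribute trajectory that rises or falls over the segment."""
    n = random.randint(5, 12)
    frames = []
    for i in range(n):
        t = i / (n - 1)
        x = t if label == "rising" else 1 - t
        frames.append([x + random.gauss(0, noise), (1 - x) + random.gauss(0, noise)])
    return frames

if __name__ == "__main__":
    random.seed(0)
    models = {name: train_track([make_segment(name) for _ in range(40)])
              for name in ("rising", "falling")}
    print(classify(make_segment("rising"), models))
```

Time normalisation by linear interpolation is one simple way to compare segments of different durations; it is chosen here for brevity and is not taken from the paper.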

Bibliographic reference.  Goldenthal, William D. / Glass, James R. (1993): "Modelling spectral dynamics for vowel classification", In EUROSPEECH'93, 289-292.