5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Building a Statistical Model of the Vowel Space for Phoneticians

Matthew Aylett

Human Communication Research Centre, University of Edinburgh, UK

Vowel space data (a two-dimensional F1/F2 plot) is of interest to phoneticians for comparing different accents, languages, speaker styles and individual speakers. Current automatic methods used by speech technologists do not generally produce traditional vowel space models; instead they tend to produce hyper-dimensional codebooks covering the speaker's entire speech stream. This makes it difficult to relate the results of these methods to observations in laboratory phonetics. To address this problem, a model was developed based on a Gaussian mixture density function fitted to F1/F2 data using expectation maximisation, producing a probability distribution in F1/F2 space. Speech was pre-processed using voicing to excerpt vowel data automatically, without any need for segmentation, and a parametric fit algorithm was applied to calculate likely vowel targets. The result was a clear visualisation of a speaker's vowel space that required no segmented or labelled speech.
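The core modelling step described above, a Gaussian mixture fitted to F1/F2 points by expectation maximisation, can be illustrated with a minimal sketch. This is not the author's implementation. The number of components, the initialisation from F2 quantiles, the variance floor and the synthetic formant values are all assumptions chosen for illustration, and the voicing-based pre-processing and the parametric fit for vowel targets are not shown.

```python
import math
import random

def gauss2d(x, mean, cov):
    """Density of a 2-D Gaussian; cov is ((a, b), (b, d))."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form via the closed-form inverse of a 2x2 matrix.
    q = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def mixture_density(x, weights, means, covs):
    """Probability density of the fitted mixture at an (F1, F2) point."""
    return sum(w * gauss2d(x, m, c) for w, m, c in zip(weights, means, covs))

def fit_gmm(points, init_means, n_iter=50, floor=1.0):
    """Fit a full-covariance 2-D Gaussian mixture with EM."""
    n, k = len(points), len(init_means)
    means = [tuple(m) for m in init_means]
    weights = [1.0 / k] * k
    # Assumption: every component starts with the overall diagonal variance.
    mu = [sum(p[d] for p in points) / n for d in (0, 1)]
    var = [sum((p[d] - mu[d]) ** 2 for p in points) / n + floor for d in (0, 1)]
    covs = [((var[0], 0.0), (0.0, var[1]))] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for p in points:
            r = [w * gauss2d(p, m, c) for w, m, c in zip(weights, means, covs)]
            total = sum(r) or 1e-300  # guard against underflow to zero
            resp.append([v / total for v in r])
        # M-step: re-estimate weights, means and covariances.
        weights, means, covs = [], [], []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            sxx = sum(r[j] * (p[0] - mx) ** 2 for r, p in zip(resp, points)) / nj
            syy = sum(r[j] * (p[1] - my) ** 2 for r, p in zip(resp, points)) / nj
            sxy = sum(r[j] * (p[0] - mx) * (p[1] - my)
                      for r, p in zip(resp, points)) / nj
            weights.append(nj / n)
            means.append((mx, my))
            # The floor on the diagonal keeps the covariance invertible.
            covs.append(((sxx + floor, sxy), (sxy, syy + floor)))
    return weights, means, covs

# Synthetic (F1, F2) samples in Hz around three made-up cluster centres.
# These values are illustrative only and are not taken from the paper.
random.seed(0)
true_means = [(320.0, 900.0), (750.0, 1300.0), (300.0, 2300.0)]
points = [(random.gauss(f1, 40.0), random.gauss(f2, 100.0))
          for f1, f2 in true_means for _ in range(200)]

# Hypothetical initialisation: pick starting means at F2 quantiles.
by_f2 = sorted(points, key=lambda p: p[1])
init = [by_f2[int((j + 0.5) * len(points) / 3)] for j in range(3)]

weights, means, covs = fit_gmm(points, init)
for w, m in zip(weights, means):
    print(f"weight={w:.2f}  F1={m[0]:.0f} Hz  F2={m[1]:.0f} Hz")
```

Evaluating `mixture_density` over a grid of F1/F2 values would give the kind of probability surface that can be plotted as a vowel space. With real speech, the input points would instead come from formant tracks of the voiced portions of the signal.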

Bibliographic reference. Aylett, Matthew (1998): "Building a statistical model of the vowel space for phoneticians", in ICSLP-1998, paper 0823.