Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
Generalized linear models (GLM) provide a flexible framework for investigating many long-standing issues concerning the relation of F0 and formant frequencies to vowel categories. These issues include the choice of frequency scale (e.g., log Hz, Bark, ERB), the effect of F0, and the importance of longer-term, speaker-dependent extrinsic information such as formant ranges or average fundamental frequency to the specification of vowel quality. As noted in [2, 3], the differences in the empirical consequences of alternate models can be quite subtle. The present paper illustrates how GLM and related techniques may inform the choice among competing approaches from three perspectives: the modeling of patterns in production data, data-analytic pattern recognition, and direct perceptual modeling. Analysis of F1 patterns from the individual Peterson and Barney  data indicates a small but consistent advantage to the log scale over the other two and a preference for extrinsic, formant-average information as a normalization parameter. Simple pattern recognition schemes show relatively little difference among the alternate scales. Initial perceptual modeling via logistic regression on the data from  also fails to provide evidence for a substantial difference among the scales. Perceptual experiments analogous to those of  specifically designed to be sensitive to the relatively subtle differences among the scales will likely be necessary to resolve the issue of "the best scale" for the representation of vowel quality.
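For readers unfamiliar with the three candidate scales, the following is a minimal sketch of how a frequency in Hz might be mapped onto each of them. The abstract does not say which Bark or ERB formula was used, so the choices here are assumptions: Traunmüller's (1990) approximation for Bark and the Glasberg and Moore (1990) ERB-rate formula. The function names and the `SCALES` table are hypothetical and are not taken from the paper.

```python
import math


def hz_to_log(f_hz):
    """Natural log of frequency in Hz (the 'log Hz' scale)."""
    return math.log(f_hz)


def hz_to_bark(f_hz):
    """Bark value via Traunmueller's (1990) approximation (assumed formula)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def hz_to_erb_rate(f_hz):
    """ERB-rate via Glasberg and Moore (1990) (assumed formula)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


# Hypothetical lookup table so that an analysis can switch scales by name.
SCALES = {
    "log": hz_to_log,
    "bark": hz_to_bark,
    "erb": hz_to_erb_rate,
}


def rescale(frequencies_hz, scale):
    """Convert a list of frequencies in Hz to the named scale."""
    convert = SCALES[scale]
    return [convert(f) for f in frequencies_hz]
```

With such a table, the same set of F0 and formant measurements can be passed through `rescale` once per scale and then fed to an identical downstream model, so that any difference in fit is attributable to the scale alone.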
Bibliographic reference. Nearey, Terrance M. (1992): "Applications of generalized linear modeling to vowel data", In ICSLP-1992, 583-586.