This paper describes an approach to exploiting within-speaker correlations among different speech sounds for phonetic recognition. By incorporating speaker-specific models and speaker-specific constraints to varying degrees, four different paradigms were suggested. These paradigms were empirically evaluated on the task of classifying eight vowels in American English, using nearly 20,000 vowel tokens excised from the TIMIT corpus. Two speaker classes, representing the male and female speakers, were used. The results suggest that by incorporating gender-specific constraints, one can improve on the performance obtained with gender-specific models alone.
Bibliographic reference. Niyogi, Partha / Zue, Victor W. (1991): "Correlation analysis of vowels and their application to speech recognition", In EUROSPEECH-1991, 1253-1256.