This study investigated the dynamic spectral patterns of vowels in Mandarin Chinese using a corpus of monosyllabic words spoken in isolation. Mel-frequency cepstral coefficients (MFCCs) were parameterized in different ways to test the nature of the dynamic information in vowels through automatic vowel classification. Compared with MFCCs extracted at the vowel midpoint alone, MFCCs extracted at two or three points (vowel onset, offset, and midpoint) greatly improved classification accuracy. Legendre polynomials fitted to the MFCCs over the entire vowel duration achieved a relative error reduction of approximately 30% over the three-point model. Euclidean cepstral distance was employed to measure the magnitude of spectral change. A negative correlation was found between the rate of spectral change and vowel duration. Vowel-dependent spectral changes appear primarily in the first half of a vowel. There is great diversity among the diphthongs, and considerable overlap between the diphthongs and the monophthongs, in terms of spectral dynamics.
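The parameterizations named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the polynomial degree of 3, the choice of the middle frame as the "midpoint", and the input layout (one row per analysis frame, one column per MFCC) are all assumptions made for illustration. The abstract does not state the polynomial order or which frames were compared when measuring spectral change.

```python
import numpy as np
from numpy.polynomial import legendre


def three_point_features(mfcc):
    """Concatenate the MFCC vectors at vowel onset, midpoint, and offset.

    mfcc: array of shape (n_frames, n_coeffs) covering one vowel.
    The midpoint is taken as the middle frame (an assumption).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    mid = mfcc.shape[0] // 2
    return np.concatenate([mfcc[0], mfcc[mid], mfcc[-1]])


def legendre_features(mfcc, degree=3):
    """Fit a Legendre series to each MFCC trajectory over the whole vowel.

    Returns an array of shape (degree + 1, n_coeffs). The degree is a
    hypothetical default, not a value taken from the paper.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    # Map frame positions onto [-1, 1], where Legendre polynomials are orthogonal.
    t = np.linspace(-1.0, 1.0, mfcc.shape[0])
    return legendre.legfit(t, mfcc, degree)


def cepstral_distance(frame_a, frame_b):
    """Euclidean distance between two MFCC vectors."""
    diff = np.asarray(frame_a, dtype=float) - np.asarray(frame_b, dtype=float)
    return float(np.linalg.norm(diff))
```

Normalizing time to [-1, 1] makes the fitted coefficients comparable across vowels of different durations, which is one reason a fixed-length polynomial description is convenient as classifier input.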
Bibliographic reference. Yuan, Jiahong (2013): "The spectral dynamics of vowels in Mandarin Chinese", In INTERSPEECH-2013, 1193-1197.