In this paper, the correlation between cepstrum coefficients and fundamental frequency (F0) is analyzed quantitatively. One of our previous studies showed that the cepstrum coefficients of vowel sounds vary with F0 and that this variation can be modeled by multivariate regression analysis. Building on that study, the current work analyzes the correlation in voiced consonants and in unvoiced consonants, as well as the dependency of the correlation on speakers and phonemes. Following these analyses, several experiments are carried out to examine whether the models built to characterize the correlation can be used for speech recognition. Results show that the distance between the distributions of two similar phones, such as /s/ and /z/ or /m/ and /n/, is significantly increased by applying the models.
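As a rough illustration of the regression idea described in the abstract, the following sketch fits a linear model of each cepstrum coefficient against log F0 and then removes the F0-dependent component. It is not the authors' procedure: the use of log F0 as the single predictor, the synthetic data, the reference F0 of 150 Hz, and the function names `fit_f0_regression` and `normalize_f0` are all assumptions made for this example.

```python
import numpy as np

def fit_f0_regression(log_f0, cepstra):
    """Least-squares fit of cepstra ~ intercept + slope * log_f0, per coefficient.

    Hypothetical helper: the predictor (log F0) is an assumption of this sketch.
    """
    X = np.column_stack([np.ones_like(log_f0), log_f0])   # shape (n_frames, 2)
    coef, *_ = np.linalg.lstsq(X, cepstra, rcond=None)    # shape (2, n_coef)
    return coef[0], coef[1]                               # intercept, slope

def normalize_f0(log_f0, cepstra, slope, ref_log_f0):
    """Shift each frame to a common reference F0 using the fitted slopes."""
    return cepstra - np.outer(log_f0 - ref_log_f0, slope)

# Synthetic data standing in for real speech frames (assumed values).
rng = np.random.default_rng(0)
n_frames, n_coef = 500, 12
log_f0 = np.log(rng.uniform(80.0, 300.0, n_frames))
true_slope = rng.normal(0.0, 0.5, n_coef)
true_intercept = rng.normal(0.0, 1.0, n_coef)
noise = rng.normal(0.0, 0.05, (n_frames, n_coef))
cepstra = true_intercept + np.outer(log_f0, true_slope) + noise

intercept, slope = fit_f0_regression(log_f0, cepstra)
normalized = normalize_f0(log_f0, cepstra, slope, ref_log_f0=np.log(150.0))

# After normalization, a second fit should find no remaining F0 dependence.
_, residual_slope = fit_f0_regression(log_f0, normalized)
```

In this toy setting the fitted slopes recover the simulated F0 dependence, and the normalized coefficients no longer vary linearly with log F0. Whether such a normalization increases the separation between similar phones on real speech is the question the paper's experiments address.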
Cite as: Minematsu, N., Tsuda, K., Hirose, K. (2001) Quantitative analysis of F0-induced variations of cepstrum coefficients. Proc. ITRW on Prosody in Speech Recognition and Understanding, paper 21
@inproceedings{minematsu01_prosody,
  author={Nobuaki Minematsu and Keiichi Tsuda and Keikichi Hirose},
  title={{Quantitative analysis of F0-induced variations of cepstrum coefficients}},
  year=2001,
  booktitle={Proc. ITRW on Prosody in Speech Recognition and Understanding},
  pages={paper 21}
}