Our previous studies found that F0 variations in Cantonese speech can be adequately represented by linear approximations of the observed F0 contours, in the sense that speech re-synthesized from the approximations is perceptually comparable to natural speech. In those studies, the approximated contours were determined manually. In this study, a framework is developed for automatic approximation of F0 contours. Based on the knowledge gained from the perceptual studies, the approximation is carried out in three steps: smoothing the contour, locating turning points, and determining the F0 values at the turning points. Perceptual evaluation was performed on re-synthesized speech of hundreds of Cantonese polysyllabic words. The results show that the proposed framework produces good approximations of the observed F0 contours: for 93% of the utterances, the re-synthesized speech is perceptually comparable to the natural speech.
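The three steps named in the abstract (smoothing, locating turning points, assigning F0 values at those points) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details the abstract does not give. The sketch assumes a moving-average smoother, a recursive maximum-deviation split to find turning points, and the smoothed value as the F0 at each turning point. All function names, the `window` size, and the `tol` threshold (in Hz) are hypothetical.

```python
from typing import List, Tuple


def smooth(f0: List[float], window: int = 3) -> List[float]:
    """Step 1 (assumed): moving average, with the window shrunk at the edges."""
    half = window // 2
    out = []
    for i in range(len(f0)):
        lo, hi = max(0, i - half), min(len(f0), i + half + 1)
        out.append(sum(f0[lo:hi]) / (hi - lo))
    return out


def turning_points(f0: List[float], tol: float) -> List[int]:
    """Step 2 (assumed): keep both endpoints, then recursively add the frame
    that deviates most from the straight line between its neighbours,
    as long as that deviation exceeds tol (Hz)."""
    if len(f0) < 2:
        raise ValueError("need at least two F0 frames")

    def split(a: int, b: int) -> List[int]:
        worst, idx = 0.0, -1
        for i in range(a + 1, b):
            line = f0[a] + (f0[b] - f0[a]) * (i - a) / (b - a)
            dev = abs(f0[i] - line)
            if dev > worst:
                worst, idx = dev, i
        if idx < 0 or worst <= tol:
            return []
        return split(a, idx) + [idx] + split(idx, b)

    last = len(f0) - 1
    return [0] + split(0, last) + [last]


def approximate(f0: List[float], window: int = 3,
                tol: float = 5.0) -> List[Tuple[int, float]]:
    """Steps 1-3: smooth, locate turning points, then take the smoothed
    value at each turning point as its F0 (an assumption of this sketch)."""
    s = smooth(f0, window)
    return [(i, s[i]) for i in turning_points(s, tol)]


def reconstruct(points: List[Tuple[int, float]], n: int) -> List[float]:
    """Rebuild an n-frame piecewise-linear contour from the turning points."""
    out = [0.0] * n
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1 + 1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out
```

For a voiced stretch given as a list of per-frame F0 values in Hz, `approximate(f0)` returns `(frame_index, f0_value)` pairs, and `reconstruct(points, len(f0))` yields the piecewise-linear contour that could be passed to a re-synthesis tool. A larger `tol` gives fewer turning points and a coarser approximation; the perceptually acceptable setting would have to come from listening tests such as those the abstract describes.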
Bibliographic reference. Li, Yujia / Lee, Tan (2010): "Perception-based automatic approximation of F0 contours in Cantonese speech", In INTERSPEECH-2010, 1425-1428.