ESCA Workshop on Prosody
Lund, Sweden
An algorithm for automatic intonation analysis is described. It is based on a two-parameter model of pitch perception that uses weighted time-averaging with a threshold. This model can be considered a non-linear fitter. In the first stage, speech is decomposed into short-duration tonal segments using short-term energy. In the second stage, these short-duration tones are analyzed using the numerical model. This yields a set of short tones (static and dynamic), together with their pitches (constant or time-varying). Stylized F0 contours are reconstructed from this set of tones. Sentences resynthesized from the stylized contours are perceptually identical to the natural sentences.
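As a rough illustration of the first stage only, the sketch below splits a signal into segments wherever short-term energy exceeds a threshold. The abstract does not give the paper's actual segmentation criterion, so the frame length, hop size, threshold value, and the function names `short_term_energy` and `segment_by_energy` are all assumptions made for this example, not the author's method.

```python
import math

def short_term_energy(signal, frame_len, hop):
    """Mean squared amplitude of each frame (frame_len and hop are assumed values)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def segment_by_energy(energies, threshold):
    """Return (first_frame, last_frame) pairs for runs of frames above threshold."""
    segments = []
    start = None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i                      # a segment opens here
        elif e <= threshold and start is not None:
            segments.append((start, i - 1))  # the segment closed on the previous frame
            start = None
    if start is not None:                  # signal ended while a segment was still open
        segments.append((start, len(energies) - 1))
    return segments

# Toy signal (not speech): two 100 Hz tone bursts separated by silence, at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 100 * n / sr) for n in range(800)]
silence = [0.0] * 800
signal = tone + silence + tone

# Hypothetical analysis settings: 10 ms non-overlapping frames, fixed threshold.
energies = short_term_energy(signal, frame_len=80, hop=80)
segments = segment_by_energy(energies, threshold=0.1)
print(segments)
```

On real speech a fixed threshold would rarely be adequate. Each resulting segment would then be passed to the pitch perception model of the second stage, which this sketch does not attempt to reproduce.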
Bibliographic reference. d'Alessandro, Christophe (1993): "A numerical model of pitch perception for short-duration vocal tones: application to intonation analysis", In Prosody-1993, 234-237.