5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Modeling the Microprosody of Pitch and Loudness for Speech Synthesis with Neural Networks

Martti Vainio (1), Toomas Altosaar (2)

(1) Department of Phonetics, University of Helsinki, Finland
(2) Acoustics Laboratory, Helsinki University of Technology, Finland

In this study of Finnish microprosody, two prosodic parameters, pitch and loudness, were modeled with artificial neural networks. The networks are of the general feedforward type and are trained with backpropagation. For each phoneme, the network predicts a series of either pitch or loudness values from the phoneme's phonologically motivated features and its phonetic environment. The tests run so far indicate that the networks model the micro-level behavior of both pitch and loudness accurately. The tests were conducted on isolated-word material, but some preliminary results obtained from sentence material are also discussed.
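
The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature table, the one-phoneme context window on each side, the hidden-layer size, the three contour points per phoneme, and the target values are all assumptions made up for illustration. It only shows the general idea of a feedforward network, trained with backpropagation, that maps a phoneme's features and its neighbors' features to a short series of pitch (or loudness) values.

```python
import math
import random

# Hypothetical binary features per phoneme: [is_vowel, is_voiced, is_stop].
# The abstract does not list the real feature set; this table is an assumption.
FEATURES = {"a": [1, 1, 0], "k": [0, 0, 1], "l": [0, 1, 0], "#": [0, 0, 0]}
PAD = "#"  # word-boundary padding symbol (assumption)

def encode(phonemes, i):
    """Concatenate features of the previous, current, and next phoneme."""
    ctx = [PAD] + list(phonemes) + [PAD]
    vec = []
    for p in ctx[i:i + 3]:  # previous, current, next
        vec.extend(FEATURES[p])
    return vec

class TinyNet:
    """One tanh hidden layer, linear outputs, squared-error backpropagation."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = x + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sum(w * v for w, v in zip(row, hb)) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.05):
        h, y = self.forward(x)
        xb, hb = x + [1.0], h + [1.0]
        err = [yi - ti for yi, ti in zip(y, target)]
        # Hidden-layer deltas, computed before the output weights change.
        dh = [(1 - h[j] ** 2) * sum(err[k] * self.w2[k][j] for k in range(len(err)))
              for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * err[k] * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * dh[j] * xb[i]
        return sum(e * e for e in err) / len(err)

def train(net, samples, epochs=500):
    """Return the mean squared error per epoch."""
    losses = []
    for _ in range(epochs):
        losses.append(sum(net.train_step(x, t) for x, t in samples) / len(samples))
    return losses

# Toy example: one word, three made-up normalized contour values per phoneme.
# These numbers are placeholders, not measurements from the paper.
word = ["k", "a", "l", "a"]
targets = [[0.0, 0.0, 0.0], [0.6, 0.7, 0.6], [0.5, 0.4, 0.5], [0.5, 0.4, 0.3]]
samples = [(encode(word, i), t) for i, t in enumerate(targets)]
net = TinyNet(n_in=9, n_hidden=4, n_out=3)
losses = train(net, samples)
```

In this sketch a separate network instance would be trained for each parameter, one on pitch targets and one on loudness targets, which matches the abstract's "either pitch or loudness" wording only loosely; the paper itself should be consulted for the actual network topology and input coding.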

Bibliographic reference. Vainio, Martti / Altosaar, Toomas (1998): "Modeling the microprosody of pitch and loudness for speech synthesis with neural networks", in ICSLP-1998, paper 0886.