ISCA Archive Eurospeech 1999

Multilingual prosody modelling using cascades of regression trees and neural networks

J. W. A. Fackrell, H. Vereecken, J.-P. Martens, Bert Van Coile

This paper describes the use of automatically trained models (Regression Trees and Multilayer Perceptrons) to predict three prosodic variables: phrase-boundary strength, word prominence and phoneme duration. The models are arranged in a cascade, so that the phrase-boundary predictions serve as input features to the prominence model, and so on down the chain. Cascade models of this type have been constructed for six languages, using specially constructed databases, and objective performance statistics are reported. For two languages (American English and Dutch), the results of a subjective evaluation experiment suggest that these prosodic models are at least as good as hand-crafted models, and sometimes better. Furthermore, preparing the training data automatically, rather than by manual labelling, seems to have no negative impact on model performance.
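
The cascade arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`followed_by_punctuation`, `is_content_word`), the helper `run_cascade`, and the three hand-written rules are hypothetical placeholders standing in for trained regression trees or multilayer perceptrons. The sketch also assumes, for simplicity, that all three variables are predicted for the same unit, whereas the paper predicts boundaries and prominence per word and duration per phoneme.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Model = Callable[[Features], float]


def run_cascade(stages: List[Tuple[str, Model]], features: Features) -> Features:
    """Apply each model in order, feeding its prediction to later stages."""
    enriched = dict(features)  # copy so the caller's input is not mutated
    for name, model in stages:
        # Each prediction is stored under the stage name and becomes an
        # input feature for every subsequent stage.
        enriched[name] = model(enriched)
    return enriched


# Placeholder models (assumptions): simple rules, not trained models.
def boundary_strength(f: Features) -> float:
    return 3.0 if f["followed_by_punctuation"] else 0.0


def word_prominence(f: Features) -> float:
    # Uses the boundary prediction made by the previous stage.
    return 1.0 * f["is_content_word"] + 0.5 * (f["boundary_strength"] > 0)


def phoneme_duration_ms(f: Features) -> float:
    # Uses both earlier predictions.
    return 60.0 + 20.0 * f["word_prominence"] + 10.0 * f["boundary_strength"]


STAGES = [
    ("boundary_strength", boundary_strength),
    ("word_prominence", word_prominence),
    ("phoneme_duration_ms", phoneme_duration_ms),
]

result = run_cascade(
    STAGES, {"followed_by_punctuation": 1.0, "is_content_word": 1.0}
)
print(result)
```

The order of `STAGES` is the essential design choice: a stage can only use predictions from stages that precede it, which mirrors the boundary, prominence, duration ordering in the abstract.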


doi: 10.21437/Eurospeech.1999-400

Cite as: Fackrell, J.W.A., Vereecken, H., Martens, J.-P., Coile, B.V. (1999) Multilingual prosody modelling using cascades of regression trees and neural networks. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1835-1838, doi: 10.21437/Eurospeech.1999-400

@inproceedings{fackrell99_eurospeech,
  author={J. W. A. Fackrell and H. Vereecken and J.-P. Martens and Bert Van Coile},
  title={{Multilingual prosody modelling using cascades of regression trees and neural networks}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1835--1838},
  doi={10.21437/Eurospeech.1999-400}
}