EUROSPEECH 2003 - INTERSPEECH 2003
This paper establishes a Bayesian probabilistic framework for the automatic acquisition of intonational phrase breaks. Under two different conditional independence assumptions, the naive Bayes and Bayesian network approaches were examined and evaluated against the CART algorithm, which has previously been used with success. The features were drawn from a finite-length window of minimal morphological and syntactic resources, namely the POS label and the type of phrase boundary, a novel syntactic feature that has not previously been applied to intonational phrase break detection. This feature can be used in languages for which syntactic parsers are not available, and it proves important not only for the proposed Bayesian methods but also for other algorithms such as CART. Trained on a 5,500-word database, Bayesian networks proved the most effective at predicting phrase breaks, with a precision of 82.3% and a recall of 77.2%.
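As a rough illustration of the naive Bayes variant over a windowed feature set, the following minimal sketch may help. It is not the authors' implementation: the tag names, the break labels "B" and "N", the window size, the add-one smoothing, and all function and class names are assumptions made for this example. The Bayesian network variant, which relaxes the independence assumption between features, is not shown.

```python
import math
from collections import Counter, defaultdict


def window_features(pos_tags, boundaries, i, size=1):
    """Collect POS and phrase-boundary labels in a window around position i.

    Positions outside the sentence are filled with a "PAD" value
    (an assumption of this sketch).
    """
    feats = {}
    for off in range(-size, size + 1):
        j = i + off
        inside = 0 <= j < len(pos_tags)
        feats[f"pos[{off}]"] = pos_tags[j] if inside else "PAD"
        feats[f"bnd[{off}]"] = boundaries[j] if inside else "PAD"
    return feats


class NaiveBayes:
    """Categorical naive Bayes with additive smoothing (hypothetical helper)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)           # feature -> observed values

    def fit(self, X, y):
        for feats, label in zip(X, y):
            self.class_counts[label] += 1
            for name, value in feats.items():
                self.feat_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus a sum of per-feature log likelihoods: each
            # feature is treated as independent given the class.
            score = math.log(n / total)
            for name, value in feats.items():
                count = self.feat_counts[(label, name)][value]
                k = len(self.values[name]) + 1  # one extra slot for unseen values
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            if score > best_score:
                best, best_score = label, score
        return best


def precision_recall(gold, pred, positive="B"):
    """Precision and recall for the break class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In use, one feature dictionary would be built per word juncture with `window_features`, the classifier fitted on annotated junctures, and `precision_recall` computed on held-out predictions.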
Bibliographic reference. Zervas, P. / Maragoudakis, M. / Fakotakis, Nikos / Kokkinakis, George (2003): "Bayesian induction of intonational phrase breaks", In EUROSPEECH-2003, 113-116.