Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Modelling Intonational Phrase Structure with Artificial Neural Networks

Grazyna Demenko (1), Wiktor Jassem (2)

(1) A. Mickiewicz University in Poznan, Institute of Linguistics, Poland
(2) Polish Academy of Sciences, Institute of Fundamental Technological Research, Poznan, Poland

A model of Polish intonation has been developed on the basis of a general theory of suprasegmentals and of experiments using both isolated utterances and continuous speech. An intonational phrase consists of an optional prenuclear tune and an obligatory nuclear tune. A three-layer MLP network was trained to distinguish 9 nuclear accents (HL, ML, LL, HM, LH, LM, MH, MM, LHL) and 2 secondary prenuclear accents, High (H) and Low (L). A total of 1600 structures (in constructed phrases) were used for training and 430 for verification. The average score for training and testing was 82 per cent. For continuous speech, the following structures were postulated: H and L for prenuclear intonation, and R (rising), F (falling), MM (level) and LHL (rising-falling) for nuclear intonation. For the testing set, a score between 79 and 83 per cent was obtained. In both classifications, an 11-element vector was used to describe the intonational structures under analysis.
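
As an illustration only, the classification setup described in the abstract can be sketched as a forward pass through a small MLP that maps an 11-element descriptor vector to one of the accent labels. This is a sketch under stated assumptions, not the authors' implementation: the hidden-layer size, the tanh and softmax activations, the random untrained weights, the placeholder input, and the decision to treat the 9 nuclear and 2 prenuclear accents as a single 11-way output are all hypothetical, since the abstract does not specify them.

```python
import math
import random

# Accent labels are taken from the abstract. Combining them into one
# 11-way output layer is an assumption of this sketch.
NUCLEAR = ["HL", "ML", "LL", "HM", "LH", "LM", "MH", "MM", "LHL"]
PRENUCLEAR = ["H", "L"]
LABELS = NUCLEAR + PRENUCLEAR

N_INPUT = 11    # 11-element descriptor vector (from the abstract)
N_HIDDEN = 16   # hypothetical: the abstract gives no hidden-layer size
N_OUTPUT = len(LABELS)


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(params, x):
    """Input layer -> tanh hidden layer -> softmax output layer."""
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(v) for v in affine(w1, b1, x)]
    return softmax(affine(w2, b2, hidden))


def classify(params, x):
    """Return the most probable label and the full probability vector."""
    probs = forward(params, x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


rng = random.Random(0)
# Untrained, randomly initialised network: the output label is meaningless
# and only demonstrates the shape of the computation.
params = (init_layer(N_INPUT, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_OUTPUT, rng))
features = [rng.uniform(-1.0, 1.0) for _ in range(N_INPUT)]  # placeholder input
label, probs = classify(params, features)
print(label, round(sum(probs), 6))
```

With trained weights in place of the random ones, `classify` would return the accent label with the highest output probability for a given descriptor vector.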



Bibliographic reference.  Demenko, Grazyna / Jassem, Wiktor (1999): "Modelling intonational phrase structure with artificial neural networks", In EUROSPEECH'99, 711-714.