Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Prosody Recognition from Speech Utterances Using Acoustic and Linguistic Based Models of Prosodic Events

Alistair Conkie, Giuseppe Riccardi, Richard C. Rose

AT&T Labs-Research, Shannon Laboratory, Florham Park, NJ, USA

A system for automatic recognition of prosodic events in speech utterances has been developed and applied to recognizing accent tones as defined by the Tones and Break Indices (ToBI) prosodic labeling standard. Both the acoustic and syntactic modeling components of the system are described in the paper. The acoustic modeling component represents ToBI-labeled events using hidden Markov models (HMMs) defined over a set of prosodic features. The syntactic modeling component predicts prosodic events using a stochastic finite-state model defined over input labels obtained from a part-of-speech (POS) tagger. The system was evaluated on its ability to recognize pitch accents in a single-speaker read-speech corpus, with the orthographic transcription of each utterance assumed to be known. It improved average labeling accuracy from 84.8%, obtained with a baseline text-only prosodic labeling system, to 88.3%.


Bibliographic reference. Conkie, Alistair / Riccardi, Giuseppe / Rose, Richard C. (1999): "Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events", in EUROSPEECH'99, 523-526.