8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Exploiting Prosodic Features for Dialog Act Tagging in a Discriminative Modeling Framework

Vivek Rangarajan (1), Srinivas Bangalore (2), Shrikanth S. Narayanan (1)

(1) University of Southern California, USA
(2) AT&T Research Labs, USA

Cue-based automatic dialog act tagging uses lexical, syntactic and prosodic knowledge to identify dialog acts. In this paper, we propose a discriminative framework for automatic dialog act tagging using maximum entropy modeling. We propose two schemes for integrating prosody into this framework: (i) syntax-based categorical prosody prediction from an automatic prosody labeler, and (ii) a novel method that models the continuous acoustic-prosodic observation sequence as a discrete sequence by means of quantization. The proposed prosodic feature integration yields a relative improvement of 11.8% over using lexical and syntactic features alone on the Switchboard-DAMSL corpus. Combining lexical, syntactic and prosodic features gives a dialog act tagging accuracy of 84.1%, close to the human agreement of 84%.
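
As a rough illustration of scheme (ii), the sketch below shows one way a continuous prosodic contour could be turned into a discrete symbol sequence and then into n-gram features of the kind a maximum entropy classifier can consume. The abstract does not specify the quantization scheme, so the uniform-width binning, the bin count, the example pitch values, and the function names `quantize` and `ngram_features` are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], num_bins: int) -> List[int]:
    """Map each value to a bin index in [0, num_bins - 1].

    Assumption: uniform-width bins over the observed range of the
    utterance. The paper may use a different quantizer.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat contour falls entirely into the first bin.
        return [0] * len(values)
    width = (hi - lo) / num_bins
    # Clamp so the maximum value lands in the last bin, not past it.
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


def ngram_features(symbols: Sequence[int], n: int, prefix: str) -> List[str]:
    """Turn a discrete symbol sequence into string-valued n-gram features."""
    return [
        prefix + "_" + "_".join(str(s) for s in symbols[i:i + n])
        for i in range(len(symbols) - n + 1)
    ]


# Hypothetical pitch (f0) contour in Hz, one value per frame.
f0_contour = [110.0, 120.0, 150.0, 190.0, 210.0]

symbols = quantize(f0_contour, num_bins=4)
features = ngram_features(symbols, n=2, prefix="f0")

print(symbols)   # [0, 0, 1, 3, 3]
print(features)  # ['f0_0_0', 'f0_0_1', 'f0_1_3', 'f0_3_3']
```

Once the contour is discrete, its n-grams can be treated like word n-grams, so lexical, syntactic and prosodic cues can share a single feature space in the classifier.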


Bibliographic reference. Rangarajan, Vivek / Bangalore, Srinivas / Narayanan, Shrikanth S. (2007): "Exploiting prosodic features for dialog act tagging in a discriminative modeling framework", In INTERSPEECH-2007, 150-153.