This paper describes a three-level discourse coding scheme that we devised for manually tagging the CallHome Spanish (CHS) and CallFriend Spanish (CFS) databases used in the CLARITY project. The goal of CLARITY is to explore the use of discourse structure in understanding conversational speech. The project combines empirical methods for dialogue processing with state-of-the-art large-vocabulary continuous speech recognition (LVCSR), using the JANUS recognizer. The three levels of the coding scheme are (1) a speech act level, with a tag set extended from DAMSL and Switchboard; (2) a dialogue game level, defined by initiative and intention; and (3) an activity level, defined within topic units. The manually tagged dialogues are used to train automatic classifiers. We present preliminary results for automatic speech act classification and topic boundary identification, along with inter-coder confusion matrices for speech acts.
Cite as: Levin, L., Thyme-Gobbel, A., Lavie, A., Ries, K., Zechner, K. (1998) A discourse coding scheme for conversational Spanish. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 1000, doi: 10.21437/ICSLP.1998-492
@inproceedings{levin98_icslp,
  author={Lori Levin and Ann Thyme-Gobbel and Alon Lavie and Klaus Ries and Klaus Zechner},
  title={{A discourse coding scheme for conversational Spanish}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 1000},
  doi={10.21437/ICSLP.1998-492}
}