Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

A Human-Human Train Timetable Dialogue Corpus

Filip Jurcicek, Jiri Zahradil, Libor Jelinek

University of West Bohemia in Pilsen, Czech Republic

This paper describes progress in the development of a human-human dialogue corpus. The corpus contains transcribed phone calls from users to a train timetable information center, in which callers ask about their train travel plans. It is based on transcriptions of user inquiries that were previously collected for the information center. We enriched these transcriptions with dialogue act tags, which include an abstract semantic annotation. The corpus comprises recorded speech of both operators and users, an orthographic transcription, a normalized transcription, a normalized transcription with named entities, and dialogue act tags with abstract semantic annotation. We propose a combination of a dialogue act tagset and an abstract semantic annotation, and we describe and apply a technique for dialogue act tagging and abstract semantic annotation.

Bibliographic reference. Jurcicek, Filip / Zahradil, Jiri / Jelinek, Libor (2005): "A human-human train timetable dialogue corpus", In INTERSPEECH-2005, 1525-1528.