Parsing human-human conversations consists of automatically enriching text transcriptions with semantic structure information. In this paper we use a FrameNet-based approach to semantics that, without requiring a full semantic parse of a message, goes further than a simple flat translation of the message into basic concepts. FrameNet-based semantic parsing may follow a syntactic parsing step; however, spoken conversations in customer service telephone call centers present very specific characteristics, such as non-canonical language, noisy messages (disfluencies, repetitions, truncated words, or automatic speech transcription errors), and the presence of superfluous information. For syntactic parsing, the traditional view based on context-free grammars is not suitable for processing non-canonical text. New approaches to parsing based on dependency structures and discriminative machine learning techniques are better suited to processing spontaneous speech for two main reasons: (a) they need less training data, and (b) annotating conversation transcripts with syntactic dependencies is simpler than annotating them with syntactic constituents. Another advantage is that partial annotation can be performed. This paper presents the adaptation of a syntactic dependency parser to process very spontaneous speech recorded in a call-center environment. The parser is used to produce FrameNet candidates for characterizing conversations between an operator and a caller.
Bibliographic reference. Bechet, Frederic / Nasr, Alexis / Favre, Benoit (2014): "Adapting dependency parsing to spontaneous speech for open domain spoken language understanding", In INTERSPEECH-2014, 135-139.