This paper describes the so-called ill-formed nature of spontaneous conversational speech as observed from the study of a 1500-hour corpus of recorded dialogue speech. We note that the structure is quite different from that of more formal speech or writing and propose a Statistical Machine Translation approach for mapping between the spoken and written forms of the language as if they were two entirely separate languages. We further posit that the particular nature of the spoken language is especially well suited for the display of affective states, inter-speaker relationships and discourse management information. In summary, both modes of communication appear to be particularly suited to their pragmatic function, neither is ill-formed, and it appears possible to map automatically between the two. This mapping has applications in speech technology for the processing of conversational speech.
Bibliographic reference. Appling, Darren Scott / Campbell, Nick (2007): "Translating conversational speech to standard linguistic form", In INTERSPEECH-2007, 2825-2828.