8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Translating Conversational Speech to Standard Linguistic Form

Darren Scott Appling (1), Nick Campbell (2)

(1) Georgia Institute of Technology, USA
(2) NICT, Japan

This paper describes the so-called ill-formed nature of spontaneous conversational speech as observed from the study of a 1500-hour corpus of recorded dialogue speech. We note that the structure is quite different from that of more formal speech or writing and propose a Statistical Machine Translation approach for mapping between the spoken and written forms of the language as if they were two entirely separate languages. We further posit that the particular nature of the spoken language is especially well suited for the display of affective states, inter-speaker relationships and discourse management information. In summary, both modes of communication appear to be particularly suited to their pragmatic function, neither is ill-formed, and it appears possible to map automatically between the two. This mapping has applications in speech technology for the processing of conversational speech.

Bibliographic reference. Appling, Darren Scott / Campbell, Nick (2007): "Translating conversational speech to standard linguistic form", in INTERSPEECH-2007, 2825-2828.