INTERSPEECH 2004 - ICSLP
This paper addresses the automatic transformation of spoken-style text into written-style text. Exact transcriptions and speech recognition results of live lectures contain many spoken-language expressions, so they are unsuitable as documents and need to be edited. We present a method that applies the statistical approach used in machine translation to this post-processing task. Specifically, we implement the correction of colloquial expressions, the deletion of fillers, the insertion of periods, and the insertion of particles in an integrated manner. A preliminary evaluation confirms that the statistical transformation framework works well, achieving high recall and precision for period and particle insertion.
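As a rough illustration of what an integrated statistical transformation could look like, the sketch below scores candidate written-style outputs with a channel model times a written-style language model, in the spirit of noisy-channel machine translation. It is not the authors' implementation: the tables `CHANNEL` and `BIGRAM`, the `FLOOR` back-off value, the function names, and the English toy tokens are all hypothetical, and particle insertion is left out for brevity.

```python
import itertools
import math

# Hypothetical channel model P(spoken | written), stored per spoken token as
# {written replacement (tuple of tokens): probability}. Toy values only.
CHANNEL = {
    "um": {(): 0.9, ("um",): 0.1},                      # filler deletion
    "gonna": {("going", "to"): 0.8, ("gonna",): 0.2},   # colloquial correction
    "start": {("start", "."): 0.5, ("start",): 0.5},    # period insertion
}

# Hypothetical bigram language model of written style, P(next | previous).
BIGRAM = {
    ("<s>", "we"): 0.5, ("we", "are"): 0.4, ("are", "going"): 0.3,
    ("going", "to"): 0.6, ("to", "start"): 0.2, ("start", "."): 0.5,
    (".", "</s>"): 0.9,
}
FLOOR = 1e-4  # assumed probability for bigrams missing from the toy table


def lm_logprob(tokens):
    """Log-probability of a written-style token sequence under BIGRAM."""
    padded = ["<s>", *tokens, "</s>"]
    return sum(math.log(BIGRAM.get(pair, FLOOR))
               for pair in zip(padded, padded[1:]))


def transform(spoken):
    """Return the written-style sequence with the best combined score.

    Tokens absent from CHANNEL are copied through unchanged. The search is
    exhaustive, which is only practical for very short inputs.
    """
    options = [list(CHANNEL.get(tok, {(tok,): 1.0}).items()) for tok in spoken]
    best, best_score = None, -math.inf
    for combo in itertools.product(*options):
        written = [w for replacement, _ in combo for w in replacement]
        score = lm_logprob(written) + sum(math.log(p) for _, p in combo)
        if score > best_score:
            best, best_score = written, score
    return best
```

With these toy tables, `transform("um we are gonna start".split())` returns `['we', 'are', 'going', 'to', 'start', '.']`: the filler is deleted, the colloquial form is corrected, and a period is inserted in a single search, because every decision is scored jointly rather than in separate passes.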
Bibliographic reference. Kawahara, Tatsuya / Shitaoka, Kazuya / Nanjo, Hiroaki (2004): "Automatic transformation of lecture transcription into document style using statistical framework", In INTERSPEECH-2004, 2881-2884.