ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition
April 13-16, 2003
This paper describes an integrated framework for paraphrasing spontaneous speech into written-style sentences. Most current speech recognition systems aim to transcribe every spoken word correctly. However, recognition results for spontaneous speech are usually difficult to understand, even when the recognition is perfect, because spontaneous speech contains redundant information and differs in style from written sentences. In particular, the style of spoken Japanese differs considerably from that of written Japanese. Techniques for paraphrasing recognition results are therefore indispensable for generating captions or minutes from speech. To realize efficient speech paraphrasing, we attempt to translate spontaneous speech directly into written-style sentences using a Weighted Finite-State Transducer (WFST). This approach makes it possible to use all the knowledge sources in a one-pass search strategy and reduces search errors, since the constraints of the paraphrasing model are applied from the beginning of the search. We conducted experiments on a 20k-word Japanese lecture speech recognition and paraphrasing task. Our approach yielded improvements in both recognition accuracy and paraphrasing accuracy compared with approaches that perform speech recognition and paraphrasing separately.
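As a rough illustration of what a weighted transduction from spoken-style to written-style tokens can look like, the following minimal sketch applies a toy transducer to a word sequence and keeps the lowest-cost output (tropical-semiring style, where weights add along a path and the minimum is kept). It is not the authors' integrated decoder: it covers only the paraphrasing step, applied to an already recognized word sequence, rather than the one-pass search described above. The arc table, the English toy vocabulary, the weights, the `passthrough` cost, and the function name `best_paraphrase` are all hypothetical and chosen only for illustration.

```python
EPS = ""  # epsilon output: the input token is consumed, nothing is emitted

# Hypothetical toy transducer (not from the paper):
# ARCS[state] -> list of (input, output, weight, next_state)
ARCS = {
    0: [
        ("uh", EPS, 0.1, 0),            # delete a filler
        ("um", EPS, 0.1, 0),            # delete a filler
        ("gonna", "going to", 0.5, 0),  # rewrite a colloquial form
        ("gonna", "gonna", 2.0, 0),     # keep it, at a higher cost
    ],
}
START, FINALS = 0, {0}


def best_paraphrase(tokens, arcs=ARCS, start=START, finals=FINALS,
                    passthrough=1.0):
    """Return (output string, cost) of the cheapest path, or None."""
    # hyps maps each reachable state to its best (cost, output tokens) so far
    hyps = {start: (0.0, [])}
    for tok in tokens:
        nxt = {}
        for state, (cost, out) in hyps.items():
            candidates = [a for a in arcs.get(state, []) if a[0] == tok]
            if not candidates:
                # Assumption: unknown words are copied unchanged at a fixed cost
                candidates = [(tok, tok, passthrough, state)]
            for _, o, w, dst in candidates:
                new_cost = cost + w
                if dst not in nxt or new_cost < nxt[dst][0]:
                    nxt[dst] = (new_cost, out + ([o] if o != EPS else []))
        hyps = nxt
    final = [(c, o) for s, (c, o) in hyps.items() if s in finals]
    if not final:
        return None
    cost, out = min(final, key=lambda x: x[0])
    return " ".join(out), cost


if __name__ == "__main__":
    print(best_paraphrase("uh we are gonna um start".split()))
```

In an integrated system of the kind the abstract describes, a transducer like this would be combined with the recognizer's other knowledge sources so that its constraints act during the search rather than on a finished transcript.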
Bibliographic reference. Hori, Takaaki / Willett, Daniel / Minami, Yasuhiro (2003): "Paraphrasing spontaneous speech using weighted finite-state transducers", in SSPR-2003, paper TAP13.