ISCA Archive ISCSLP 2004

Analysis of Paraphrased Corpus and Lexical-Based Approach to Chinese Paraphrasing

Yan Zhang, Hideki Kashioka

In this paper, we first analyze the language phenomena and distribution characteristics of Chinese spontaneous utterances that have already been paraphrased by other approaches. Based on the information obtained from this corpus, we propose a lexical-based approach to paraphrasing Chinese spoken language. Our purpose is to transform varied expressions into simplified expressions with the same meaning. Verbs are the main constituents of Chinese sentences, and their flexibility gives them an important role in expressing sentence structure; this is especially true of transitive verbs. Furthermore, negative verb expressions frequently appear in question utterances to express enquiries. We therefore design four types of paraphrasing templates based on lexical information and the characteristics of the corpus: (1) synonym replacement, (2) Chinese transitive verbs, (3) verbs with two objects, and (4) the transformation of negative expressions. Our experiment shows that the lexical-based approach is effective for Chinese paraphrasing.


Cite as: Zhang, Y., Kashioka, H. (2004) Analysis of Paraphrased Corpus and Lexical-Based Approach to Chinese Paraphrasing. Proc. International Symposium on Chinese Spoken Language Processing, 325-328

@inproceedings{zhang04b_iscslp,
  author={Yan Zhang and Hideki Kashioka},
  title={{Analysis of Paraphrased Corpus and Lexical-Based Approach to Chinese Paraphrasing}},
  year=2004,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={325--328}
}