8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Why is the Special Structure of the Language Important for Chinese Spoken Language Processing? - Examples on Spoken Document Retrieval, Segmentation and Summarization

Lin-shan Lee, Yuan Ho, Jia-fu Chen, Shun-Chuan Chen

National Taiwan University, Taiwan

The Chinese language is not only spoken by the largest population in the world, but is also quite different from many Western languages, with a very special structure. It is not alphabetic: a large number of Chinese characters are ideographic symbols, each pronounced as a monosyllable. The open-vocabulary nature, the flexible wording structure and the tone behavior are further examples of this special structure. It is believed that better performance can be obtained in developing Chinese spoken language processing technologies if this special structure is taken into account. In this paper, a set of "feature units" for Chinese spoken language processing is identified, and the retrieval, segmentation and summarization of Chinese spoken documents are taken as examples for analyzing the use of such "feature units". Experimental results indicate that, with careful consideration of the special structure and a proper choice of the "feature units", significantly better performance can be achieved.


Bibliographic reference. Lee, Lin-shan / Ho, Yuan / Chen, Jia-fu / Chen, Shun-Chuan (2003): "Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization", in EUROSPEECH-2003, 49-52.