ISCA Archive ISCSLP 2004

An Initial Prototype System for Chinese Spoken Document Understanding and Organization for Indexing/Browsing and Retrieval Applications

LinShan Lee, ShunChuan Chen, Yuan Ho, JiaFu Chen, MingHan Li, Tehsuan Li

In the future, network content will include all knowledge, information, and services relevant to our daily life. The most attractive form of future network content will be multi-media, which usually includes voice information. When voice information is included, it usually carries the core concepts of the content. As a result, the spoken documents associated with multi-media content can very possibly serve as the key for indexing/browsing and retrieval. However, unlike written documents, multi-media or voice information is very often just audio/video signals. These signals are very difficult to index, browse, or retrieve, since users cannot go through each of them from beginning to end while browsing. A possible approach is to segment the audio/video signals automatically into short paragraphs, each with a central concept or topic, and then automatically generate a title and/or a summary for each of these short paragraphs, in either speech or text form. The topics and central concepts described in the segmented short paragraphs are then further analyzed and organized into graphic structures describing the relationships among these topics and central concepts. In this way, the multi-media content can be indexed automatically and much more efficiently, and browsed and retrieved by the user based on the titles, summaries, and graphic structures. This is referred to here as the understanding and organization of spoken documents. In this paper, an initial prototype system for such functions is presented, with broadcast news taken as the example multi-media content. The graphic structures used to describe the relationships among the topics and central concepts are 2-dimensional tree structures developed based on probabilistic latent semantic analysis.


Cite as: Lee, L., Chen, S., Ho, Y., Chen, J., Li, M., Li, T. (2004) An Initial Prototype System for Chinese Spoken Document Understanding and Organization for Indexing/Browsing and Retrieval Applications. Proc. International Symposium on Chinese Spoken Language Processing, 329-332

@inproceedings{lee04_iscslp,
  author={LinShan Lee and ShunChuan Chen and Yuan Ho and JiaFu Chen and MingHan Li and Tehsuan Li},
  title={{An Initial Prototype System for Chinese Spoken Document Understanding and Organization for Indexing/Browsing and Retrieval Applications}},
  year=2004,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={329--332}
}