This paper proposes a two-stage approach to story segmentation and detection for Mandarin broadcast news. In the first stage, a topic classifier is constructed to identify the topic of the broadcast news within a sliding window and to determine the potential story boundaries. In the second stage, story segmentation is recast as the search for a chromosome (a number sequence) in a search space. A genetic algorithm is adopted to determine this chromosome globally, and the resulting chromosome represents the final story boundaries. A topic strength measure is defined as the fitness function of the genetic algorithm. To evaluate the proposed approach, word-based and syllable-based story segmentation systems were constructed. Experimental results show that the proposed method outperforms Makhoul's method for segmentation and detection on Mandarin broadcast news, achieving a miss probability of 32.94% and a false alarm probability of 22.83%.
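The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the topic strength measure, the chromosome encoding, or the genetic operators, so everything below is an assumption made for illustration. Here the chromosome is assumed to be a bit sequence that keeps or drops each potential boundary, the hypothetical `fitness` function rewards segments dominated by a single topic label and penalizes each kept boundary, and the names `candidate_boundaries`, `evolve`, `penalty`, `pop_size`, and `mutation_rate` are all invented for this example.

```python
import random
from collections import Counter


def candidate_boundaries(window_topics):
    """Potential boundaries: positions where the per-window topic label changes."""
    return [i for i in range(1, len(window_topics))
            if window_topics[i] != window_topics[i - 1]]


def fitness(chromosome, candidates, window_topics, penalty=1.5):
    """Hypothetical stand-in for a topic strength measure (an assumption).

    Each segment contributes the count of its most frequent topic label,
    and every kept boundary costs `penalty`.
    """
    cuts = [c for c, bit in zip(candidates, chromosome) if bit]
    edges = [0] + cuts + [len(window_topics)]
    score = 0.0
    for start, end in zip(edges, edges[1:]):
        counts = Counter(window_topics[start:end])
        score += counts.most_common(1)[0][1]
    return score - penalty * len(cuts)


def evolve(window_topics, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Search for the bit sequence (chromosome) with the highest fitness."""
    rng = random.Random(seed)
    candidates = candidate_boundaries(window_topics)
    n = len(candidates)
    if n == 0:
        return []

    def score(chromosome):
        return fitness(chromosome, candidates, window_topics)

    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with offspring.
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]  # bit-flip mutation
            children.append(child)
        population = parents + children

    best = max(population, key=score)
    return [c for c, bit in zip(candidates, best) if bit]


if __name__ == "__main__":
    # Toy per-window topic labels, standing in for first-stage classifier output.
    topics = ["sports"] * 4 + ["politics"] + ["sports"] * 3 + ["weather"] * 5
    print("potential boundaries:", candidate_boundaries(topics))
    print("selected boundaries:", evolve(topics))
```

In this toy setup, the single `politics` window creates two potential boundaries that a purely local decision would keep. Scoring the whole chromosome at once lets the search weigh every boundary against the others, which is the sense in which the boundaries are determined globally.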
Cite as: Hsieh, J.-H., Wu, C.-H., Fung, K.-A. (2003) Two-stage story segmentation and detection on broadcast news using genetic algorithm. Proc. ISCA Workshop on Multilingual Spoken Document Retrieval (MSDR 2003), 55-60
@inproceedings{hsieh03_msdr,
  author={Jia-Hsin Hsieh and Chung-Hsien Wu and Kuao-Ann Fung},
  title={{Two-stage story segmentation and detection on broadcast news using genetic algorithm}},
  year=2003,
  booktitle={Proc. ISCA Workshop on Multilingual Spoken Document Retrieval (MSDR 2003)},
  pages={55--60}
}