ISCA Archive ICSLP 1998

Error analysis and confidence measure of Chinese word segmentation

Chih-Chung Kuo, Kun-Yuan Ma

Word segmentation of Chinese sentences is essential for many applications in language and speech processing. No existing method achieves word segmentation without errors. We propose a confidence measure for the segmentation result to cope with the problems these errors cause. The method relies mainly on an error analysis of the word segmentation. With the confidence measure, suspected errors can be identified, so that the manual inspection load can be greatly reduced for non-real-time applications. A soft-decision method and a composite-word approach for prosody generation are also designed for text-to-speech systems by exploiting the confidence measure, so that incorrect prosody caused by wrong word boundaries can be alleviated.


doi: 10.21437/ICSLP.1998-32

Cite as: Kuo, C.-C., Ma, K.-Y. (1998) Error analysis and confidence measure of Chinese word segmentation. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 1078, doi: 10.21437/ICSLP.1998-32

@inproceedings{kuo98_icslp,
  author={Chih-Chung Kuo and Kun-Yuan Ma},
  title={{Error analysis and confidence measure of Chinese word segmentation}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 1078},
  doi={10.21437/ICSLP.1998-32}
}