INTERSPEECH 2004 - ICSLP
We model Mandarin phrase break prediction as a classification problem over a three-level prosodic structure and apply conditional maximum entropy (ME) classification to it. We acquire multiple levels of linguistic knowledge from an annotated corpus and integrate them as features in the maximum entropy framework. Five kinds of features are used to represent various linguistic constraints: POS tag features, lexical features, phonetic features, length features, and distance features. Experimental results show that our method performs better than previous methods and that the conditional ME model is very effective against the data sparseness problem in Mandarin phrase break prediction.
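As a rough illustration of the general idea, the following minimal sketch shows how binary features at a word boundary could feed a conditional maximum entropy model, p(y | x) = exp(sum_i w_i f_i(x, y)) / Z(x). It is not the authors' implementation: the label names, the feature templates, the function names, and the example weight are all assumptions made for illustration. Phonetic features are omitted because the abstract does not describe them, and word length is counted in characters here as an assumed stand-in for the paper's length features.

```python
import math

# Hypothetical label set: the abstract states three prosodic levels
# but does not name them, so these names are placeholders.
LABELS = ("level_1", "level_2", "level_3")


def extract_features(words, pos_tags, i):
    """Return illustrative binary features for the boundary after words[i]."""
    has_next = i + 1 < len(words)
    next_pos = pos_tags[i + 1] if has_next else "<END>"
    next_word = words[i + 1] if has_next else "<END>"
    return [
        f"pos={pos_tags[i]}",                     # POS tag feature
        f"pos_bigram={pos_tags[i]}_{next_pos}",   # POS context feature
        f"word={words[i]}",                       # lexical feature
        f"next_word={next_word}",                 # lexical context feature
        f"len={len(words[i])}",                   # length feature (characters)
        f"dist_start={i + 1}",                    # distance from sentence start
    ]


def maxent_probs(features, weights):
    """Conditional ME distribution over LABELS for one boundary.

    `weights` maps (feature, label) pairs to real-valued weights;
    pairs that are absent contribute a weight of 0.
    """
    scores = {
        y: sum(weights.get((f, y), 0.0) for f in features) for y in LABELS
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())      # normalizer Z(x)
    return {y: e / z for y, e in exps.items()}


# Usage with a made-up weight; real weights would be estimated from
# the annotated corpus.
feats = extract_features(["我", "喜欢", "音乐"], ["r", "v", "n"], 1)
probs = maxent_probs(feats, {("pos_bigram=v_n", "level_1"): 1.0})
predicted = max(probs, key=probs.get)
```

Feature and label pairs never seen in training simply keep a weight of zero in this sketch, so the model falls back on whichever other features fire at that boundary.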
Bibliographic reference. Zheng, Yu / Lee, Gary Geunbae / Kim, Byeongchang (2004): "Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework", In INTERSPEECH-2004, 737.