9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Recognizing Named Entities in Spoken Chinese Dialogues with a Character-Level Maximum Entropy Tagger

Changchun Bao, Weiqun Xu, Yonghong Yan

Chinese Academy of Sciences, China

Named Entity Recognition (NER) is an important task in information extraction, where most attention has been paid to written text in the style of news or academic (especially biomedical) papers. In this paper we report the first piece of work on NER in spoken Chinese dialogues, as a preliminary step towards spoken language understanding. The NER task is treated as a sequential classification problem and solved with a character-level maximum entropy (maxent) tagger. Although spoken data seems noisier than written data, with a set of carefully selected features the maxent tagger achieves an overall F1 score of 91.87 on our dialogue data.
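To make the character-level formulation concrete, the following is a minimal sketch of how an utterance might be turned into per-character classification instances. It is not the authors' implementation: the abstract does not give the tag set or the feature templates, so the BIO tagging scheme, the `ORG` entity type, the two-character context window, and all function names here are assumptions chosen for illustration. Each character receives one tag, and each tagging decision is described by indicator features of the kind a maxent classifier could weight.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation format: (start, end_exclusive, entity_type) in character offsets.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Assign one B-/I-/O tag to every character (BIO scheme is an assumption)."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def char_features(text: str, i: int, prev_tag: str, window: int = 2) -> Dict[str, float]:
    """Indicator features for position i: nearby characters, one bigram, previous tag.

    These templates are illustrative, not the feature set used in the paper.
    """
    feats: Dict[str, float] = {}
    for off in range(-window, window + 1):
        j = i + off
        ch = text[j] if 0 <= j < len(text) else "<PAD>"
        feats[f"c[{off}]={ch}"] = 1.0
    nxt = text[i + 1] if i + 1 < len(text) else "<PAD>"
    feats[f"c[0]c[1]={text[i]}{nxt}"] = 1.0
    feats[f"t[-1]={prev_tag}"] = 1.0
    return feats


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into entity spans; a stray I- tag is treated as O."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


# Example utterance: "I want to go to Peking University" (the entity label is hypothetical).
text = "我想去北京大学"
spans = [(3, 7, "ORG")]
tags = spans_to_bio(text, spans)
instances = [
    (char_features(text, i, tags[i - 1] if i > 0 else "<S>"), tags[i])
    for i in range(len(text))
]
```

Each `(features, tag)` pair in `instances` is one training example for a multiclass classifier. Working at the character level sidesteps word segmentation, which is one plausible reason for this design choice when the input is conversational Chinese.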


Bibliographic reference.  Bao, Changchun / Xu, Weiqun / Yan, Yonghong (2008): "Recognizing named entities in spoken Chinese dialogues with a character-level maximum entropy tagger", In INTERSPEECH-2008, 1145-1148.