ISCA Archive Interspeech 2007

A soft-clustering algorithm for automatic induction of semantic classes

Elias Iosif, Alexandros Potamianos

In this paper, we propose a soft-decision, unsupervised clustering algorithm that induces semantic classes automatically. Rather than deterministically assigning each word to a single semantic class, the algorithm uses the probability of class membership for each word. Semantic similarity between words is measured with a context-based similarity distance. The proposed soft-decision algorithm is compared with various "hard" clustering algorithms, e.g., [1], and is shown to improve semantic class induction performance in terms of both precision and recall on a travel reservation corpus. Additional performance improvement is achieved by combining (auto-induced) semantic information with lexical information to derive the semantic similarity distance.
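
As a rough illustration of the soft-decision idea described in the abstract, the sketch below scores a word against several candidate classes and normalizes the scores into membership probabilities instead of picking a single class. It is not the authors' algorithm: the abstract does not specify the distance or the update procedure, so the cosine similarity over left and right neighbor counts, the averaging over class members, and all function names (`context_distributions`, `similarity`, `soft_memberships`) are assumptions made for this example, as are the toy sentences.

```python
from collections import Counter, defaultdict
import math


def context_distributions(sentences):
    """Count the immediate left and right neighbours of every word."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            left[tokens[i]][tokens[i - 1]] += 1
            right[tokens[i]][tokens[i + 1]] += 1
    return left, right


def cosine(a, b):
    """Cosine similarity between two count vectors (assumed metric)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similarity(w1, w2, left, right):
    """Context-based similarity: mean of left and right context cosines."""
    return 0.5 * (cosine(left[w1], left[w2]) + cosine(right[w1], right[w2]))


def soft_memberships(word, classes, left, right):
    """Return P(class | word) by normalizing mean similarity to each class."""
    scores = {
        name: sum(similarity(word, m, left, right) for m in members) / len(members)
        for name, members in classes.items()
    }
    total = sum(scores.values())
    if total == 0:
        # No contextual evidence at all: fall back to a uniform distribution.
        return {name: 1.0 / len(classes) for name in classes}
    return {name: s / total for name, s in scores.items()}


# Toy travel-style sentences (invented for illustration only).
sentences = [
    "fly from boston to denver",
    "fly from chicago to dallas",
    "fly from atlanta to boston",
    "leave on monday morning",
    "leave on friday morning",
]
left, right = context_distributions(sentences)
classes = {"city": ["boston", "chicago"], "day": ["monday"]}
probs = soft_memberships("atlanta", classes, left, right)
print(probs)
```

A hard-decision variant would keep only the highest-scoring class for each word; retaining the full distribution lets a word with mixed contexts contribute to more than one class.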


doi: 10.21437/Interspeech.2007-449

Cite as: Iosif, E., Potamianos, A. (2007) A soft-clustering algorithm for automatic induction of semantic classes. Proc. Interspeech 2007, 1609-1612, doi: 10.21437/Interspeech.2007-449

@inproceedings{iosif07_interspeech,
  author={Elias Iosif and Alexandros Potamianos},
  title={{A soft-clustering algorithm for automatic induction of semantic classes}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1609--1612},
  doi={10.21437/Interspeech.2007-449}
}