Symposium on Machine Learning in Speech and Language Processing (MLSLP), Bellevue, WA, USA
A key requirement for being able to learn a good classifier is having enough
labeled data. In many situations, however, unlabeled data is easily available but
labels are expensive to come by. In the active learning scenario, each label has a
non-negligible cost, and the goal, starting with a large pool of unlabeled data,
is to adaptively decide which points to label, so that a good classifier is
obtained at low cost.
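
As a rough illustration of this setting (not a method taken from the talk), consider the textbook case of learning a one-dimensional threshold classifier. The sketch below assumes noise-free labels and a hypothetical `oracle` function that returns the label of a point at a cost of one query; the names `active_learn_threshold`, `oracle`, and `true_threshold` are invented for this example. Choosing queries adaptively by binary search labels only about log2(n) of the n pool points, whereas labeling the whole pool would cost n queries.

```python
import random


def active_learn_threshold(pool, oracle):
    """Locate a 1-D decision threshold by binary search over a sorted pool.

    pool   -- list of unlabeled real-valued points
    oracle -- hypothetical labeling function, point -> 0 or 1 (one call = one label)
    Returns (threshold_estimate, number_of_labels_requested).
    Assumes noise-free labels: 0 below some threshold, 1 at or above it.
    """
    xs = sorted(pool)
    lo, hi = 0, len(xs)  # invariant: xs[:lo] are all 0, xs[hi:] are all 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # each oracle call is one paid label
        if oracle(xs[mid]) == 1:
            hi = mid  # the first positive point is at mid or to its left
        else:
            lo = mid + 1  # the first positive point is to the right of mid
    # lo now indexes the smallest positive point (or len(xs) if there is none)
    threshold = xs[lo] if lo < len(xs) else float("inf")
    return threshold, queries


# Toy pool: 1000 unlabeled points, with a hidden threshold known only to the oracle.
random.seed(0)
pool = [random.random() for _ in range(1000)]
true_threshold = 0.37  # assumption made purely for this demo


def oracle(x):
    return int(x >= true_threshold)


estimate, n_labels = active_learn_threshold(pool, oracle)
# The learned rule "x >= estimate" agrees with the oracle on every pool point,
# using at most 10 labels for a pool of 1000.
```

This clean behavior depends on the noise-free assumption. With noisy labels, or with hypothesis classes richer than a single threshold, naive adaptive querying can concentrate labels in an unrepresentative region of the pool.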
Many active learning strategies run into severe problems
with sampling bias; the theory has therefore focused on how to correctly manage
this bias while attaining good label complexity. I will summarize recent work in
the machine learning community that achieves this goal through algorithms that are
simple and practical enough to be used in large-scale applications.
Bibliographic reference. Dasgupta, Sanjoy (2011): "Recent advances in active learning", In MLSLP-2011.