Symposium on Machine Learning in Speech and Language Processing (MLSLP)
Bellevue, WA, USA
Natural language processing is obstructed by two problems: ambiguity and
skewed distributions. Together they engender acute sparsity of data
for supervised learning of both grammars and parsing models.
The paper expresses some pessimism about getting around this data sparsity with unsupervised methods, and considers the prospects for finding naturally labeled datasets to extend supervised methods.
Bibliographic reference. Steedman, Mark (2011): "Some open problems in machine learning for NLP", in MLSLP-2011 (abstract).