Symposium on Machine Learning in Speech and Language Processing (MLSLP)
Bellevue, WA, USA
Submodular functions have a long history in economics, game theory, combinatorial optimization, electrical networks, and operations research. A submodular function operates on subsets of a finite ground set, and has certain properties that make its optimization either tractable or well-approximable, whereas otherwise neither would be possible. In this talk, after briefly reviewing the relevant properties, we will discuss several applications of submodularity in speech and natural language processing. First, we will see how submodular functions can be used to address the problem of finding a subset of a large speech corpus that is useful for accurately and rapidly prototyping novel and computationally expensive speech recognition architectures. The principal partition of a submodular function allows rapid exploration of the tradeoff between limiting the number and quality of types vs. the number and quality of tokens in such a corpus. Second, we present a class of submodular functions useful for document summarization tasks, each combining fidelity and diversity terms. We show results better than the existing state of the art in both generic and query-focused document summarization on the DUC 2004-2007 evaluations. We also show that some well-known methods for document summarization are in fact submodular. Lastly, time permitting, we may present new methods for active semi-supervised learning using submodular functions.
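To make the diminishing-returns property and its algorithmic payoff concrete, here is a minimal sketch (not from the talk): a set-cover function, a classic monotone submodular function, maximized under a cardinality constraint by the standard greedy algorithm, which for such functions carries a (1 - 1/e) approximation guarantee (Nemhauser, Wolsey, and Fisher, 1978). The candidate sets and the `coverage` function are hypothetical illustrations only.

```python
# Hypothetical example: each candidate (e.g. a sentence) covers a set of
# concepts; f(S) counts the concepts covered by the union of the chosen sets.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}

def coverage(S):
    """f(S) = |union of the sets indexed by S|; monotone and submodular."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def greedy(f, ground, k):
    """Greedily add the element with the largest marginal gain k times.
    For monotone submodular f this achieves a (1 - 1/e) approximation."""
    S = set()
    for _ in range(k):
        best = max(ground - S, key=lambda e: f(S | {e}) - f(S))
        S.add(best)
    return S

chosen = greedy(coverage, set(sets), 2)   # e.g. picks sets 0 and 2, covering 6 concepts
```

The key property driving the guarantee is diminishing returns: adding an element to a larger set never helps more than adding it to a smaller one, i.e. `f(A | {e}) - f(A) >= f(B | {e}) - f(B)` whenever `A` is a subset of `B`.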
Bibliographic reference. Bilmes, Jeff / Lin, Hui / Guillory, Andrew (2011): "Applications of submodular functions in speech and NLP", In MLSLP-2011.