Spoken language understanding (SLU) in today's conversational systems focuses on recognizing a set of domains, intents, and associated arguments that are determined by application developers. User requests that are not covered by these are usually directed to search engines and may remain unhandled. We propose a method that aims to find common user intents among these uncovered, out-of-domain utterances, with the goal of supporting future phases of dialog system design. Our approach relies on finding common semantic patterns in uncovered user utterances using a semantic parser based on Abstract Meaning Representation (AMR). We represent the corpus as a graph and find subgraphs that represent clusters by pruning the corpus graph according to frequency and entropy. We employ crowd-workers to select and label the resulting clusters and compare them with two baselines. Experimental analyses show that we obtain higher coverage and accuracy with the semantic-parsing-based clustering method. Furthermore, since the intents and candidate slots are already induced, these utterances can also be used in unsupervised SLU modeling. In intent classification experiments, we show that the statistical model trained using the clusters formed by this approach yields a higher classification F-measure (about 25% relative improvement) than the alternatives.
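As a rough illustration of pruning by frequency and entropy, the following Python sketch groups utterances by a shared semantic pattern and keeps only the groups that pass both thresholds. It is not the authors' implementation. The tuple format (utterance, predicate, relation, filler), the function names, the threshold values, and the choice to measure entropy over argument fillers are all assumptions made for this example; the abstract does not specify how the two criteria are applied, and the parses are written by hand rather than produced by an AMR parser.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of filler counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cluster_by_pattern(parsed, min_freq=2, min_entropy=0.5):
    """Group utterances by (predicate, relation) and prune weak groups.

    parsed: list of (utterance, predicate, relation, filler) tuples.
    This simplified format is a hypothetical stand-in for parser output.
    A group survives only if it is frequent enough and its fillers vary
    enough to look like a slot (both thresholds are assumed values).
    """
    fillers = defaultdict(Counter)
    members = defaultdict(list)
    for utterance, predicate, relation, filler in parsed:
        key = (predicate, relation)
        fillers[key][filler] += 1
        members[key].append(utterance)

    clusters = {}
    for key, counts in fillers.items():
        frequency = sum(counts.values())
        if frequency >= min_freq and entropy(counts) >= min_entropy:
            clusters[key] = members[key]
    return clusters


# Hand-written toy parses, for illustration only.
sample = [
    ("play some jazz", "play", "ARG1", "jazz"),
    ("play the beatles", "play", "ARG1", "beatles"),
    ("play rock music", "play", "ARG1", "rock"),
    ("thank you", "thank", "ARG1", "you"),
    ("thank you", "thank", "ARG1", "you"),
    ("call mom", "call", "ARG1", "mom"),
]
clusters = cluster_by_pattern(sample)
```

In this toy input, the ("play", "ARG1") group is frequent and has varied fillers, so it is kept as a candidate intent with a candidate slot. The "thank" group is dropped because its fillers never vary, and the "call" group is dropped because it occurs only once.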
Bibliographic reference. Hakkani-Tür, Dilek / Ju, Yun-Cheng / Zweig, Geoffrey / Tur, Gokhan (2015): "Clustering novel intents in a conversational interaction system with semantic parsing", In INTERSPEECH-2015, 1854-1858.