16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Topic Modeling for Conference Analytics

Pengfei Liu (1), Shoaib Jameel (1), Wai Lam (1), Bin Ma (2), Helen Meng (1)

(1) Chinese University of Hong Kong, China
(2) A*STAR, Singapore

This work presents our attempt to understand the research topics that characterize the papers submitted to a conference, using topic modeling and data visualization techniques. We infer the latent topics from the abstracts of all papers submitted to Interspeech 2014 by means of Latent Dirichlet Allocation. The resulting per-topic word distributions are visualized as word clouds. We also compare the automatically inferred topics against the expert-defined topics (known as tracks at Interspeech 2014). The comparison is based on an information retrieval framework in which each latent topic serves as a query and each track as a document. For each latent topic, we retrieve a list of tracks ranked by degree of word overlap, and associate the topic with the top-scoring track. Applied to all Interspeech 2014 submissions, this analytic procedure provides an overview of topic categorization in the conference, popular versus unpopular topics, emerging topics and topic compositions. Such insights are potentially valuable for understanding the technical content of a field and planning the future development of its conference(s).
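The topic-to-track matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact scoring function, so the one used here (the fraction of a topic's top words that also occur in a track's word list) is an assumption, as are the function names and the toy topic and track data, all of which are invented for illustration. The topics are assumed to have already been inferred by a topic model and reduced to their top words.

```python
def overlap_score(topic_words, track_words):
    """Fraction of the topic's top words that also occur in the track's words.

    Assumed scoring function; the abstract only says "degree of word overlap".
    """
    topic_set = set(topic_words)
    if not topic_set:
        return 0.0
    return len(topic_set & set(track_words)) / len(topic_set)


def rank_tracks(topic_words, tracks):
    """Treat the topic as a query and return (track, score) pairs, best first."""
    scored = [(name, overlap_score(topic_words, words))
              for name, words in tracks.items()]
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def assign_topics(topics, tracks):
    """Associate each latent topic with its top-scoring track."""
    return {topic_id: rank_tracks(words, tracks)[0][0]
            for topic_id, words in topics.items()}


# Hypothetical toy data: top words per latent topic and words per track.
topics = {
    "topic_0": ["speech", "recognition", "acoustic", "neural", "network"],
    "topic_1": ["speaker", "verification", "vector", "language", "identification"],
}
tracks = {
    "Speech Recognition": ["speech", "recognition", "acoustic", "model",
                           "neural", "network"],
    "Speaker and Language Identification": ["speaker", "language",
                                            "identification", "verification"],
}

assignment = assign_topics(topics, tracks)
for topic_id, track in assignment.items():
    print(topic_id, "->", track)
```

Keeping the full ranked list from `rank_tracks`, rather than only the top entry, would also show how a topic is spread across several tracks, which may help when looking at topic compositions.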

Bibliographic reference. Liu, Pengfei / Jameel, Shoaib / Lam, Wai / Ma, Bin / Meng, Helen (2015): "Topic modeling for conference analytics", in INTERSPEECH-2015, 707-711.