ISCA Archive ICSLP 1994

Issues in topic identification on the switchboard corpus

John McDonough, Herbert Gish

Topic identification (TID) is the automatic classification of speech messages into one of a known set of possible topics. The TID task can be viewed as having three principal components: 1) event generation, 2) keyword event selection, and 3) topic modeling. Using data from the Switchboard corpus, we present experimental results for various approaches to the TID problem and compare their relative effectiveness. In particular, we examine issues in topic modeling and keyword selection.
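As a rough illustration of how the three components fit together, the sketch below wires them into a toy text pipeline. It is not the method evaluated in the paper: the abstract does not specify the event generator, the selection criterion, or the topic model, so everything here is an assumption. Events are plain word tokens from a transcript (standing in for the output of a speech front end), keywords are ranked by a hypothetical count-concentration score, and the topic model is a multinomial with add-one smoothing. All function names and the example data are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def generate_events(message):
    # 1) Event generation (assumption): lowercase word tokens from a transcript.
    return message.lower().split()

def select_keywords(training, top_n):
    # 2) Keyword event selection (hypothetical rule): prefer words whose
    # occurrences are concentrated in a single topic.
    per_topic = defaultdict(Counter)
    for topic, message in training:
        per_topic[topic].update(generate_events(message))
    totals = Counter()
    for counts in per_topic.values():
        totals.update(counts)

    def score(word):
        best = max(counts[word] for counts in per_topic.values())
        return 2 * best - totals[word]  # count in best topic minus count elsewhere

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(totals, key=lambda w: (-score(w), w))[:top_n]

def train_topic_models(training, keywords):
    # 3) Topic modeling (assumption): per-topic multinomial over the selected
    # keywords, with add-one smoothing, stored as log-probabilities.
    vocab = set(keywords)
    counts = defaultdict(Counter)
    for topic, message in training:
        counts[topic].update(e for e in generate_events(message) if e in vocab)
    models = {}
    for topic, c in counts.items():
        denom = sum(c.values()) + len(keywords)
        models[topic] = {w: math.log((c[w] + 1) / denom) for w in keywords}
    return models

def classify(message, models):
    # Pick the topic with the highest log-likelihood (uniform prior assumed);
    # events that are not keywords contribute nothing to the score.
    events = generate_events(message)

    def loglik(topic):
        return sum(models[topic].get(e, 0.0) for e in events)

    return max(models, key=loglik)

# Invented toy data, not drawn from the Switchboard corpus.
training = [
    ("pets", "my dog chases the cat"),
    ("pets", "the cat and the dog sleep"),
    ("cars", "the car engine needs oil"),
    ("cars", "my car has a new engine"),
]
keywords = select_keywords(training, top_n=4)
models = train_topic_models(training, keywords)
```

With this toy data, calling classify("a dog and a cat", models) scores the message against each topic using only the selected keywords. Separating the stages this way makes it possible to swap the selection rule or the topic model independently, which is the kind of comparison the abstract describes.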


Cite as: McDonough, J., Gish, H. (1994) Issues in topic identification on the switchboard corpus. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 2163-2166

@inproceedings{mcdonough94_icslp,
  author={John McDonough and Herbert Gish},
  title={{Issues in topic identification on the switchboard corpus}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={2163--2166}
}