Visual, Laughter, Applause and Spoken Expression Features for Predicting Engagement Within TED Talks

Fasih Haider, Fahim A. Salim, Saturnino Luz, Carl Vogel, Owen Conlan, Nick Campbell


There is an enormous amount of audio-visual content available online in the form of talks and presentations, and prospective users face difficulties in finding the content that is right for them. Automatic detection of interesting (engaging vs. non-engaging) content can help users find videos according to their preferences, and can also support recommendation and personalised video segmentation systems. This paper presents a study of engagement based on 1,338 TED talks rated by online viewers. It proposes novel models to predict viewer engagement using high-level visual features (camera angles), the audience's laughter and applause, and the presenter's speech expressions. The results show that these features contribute to the prediction of user engagement in these talks. Moreover, identifying engaging speech expressions could also help a system to summarise TED talks (video summarisation) and to give presenters feedback about their speech expressions during talks.


DOI: 10.21437/Interspeech.2017-1633

Cite as: Haider, F., Salim, F.A., Luz, S., Vogel, C., Conlan, O., Campbell, N. (2017) Visual, Laughter, Applause and Spoken Expression Features for Predicting Engagement Within TED Talks. Proc. Interspeech 2017, 2381-2385, DOI: 10.21437/Interspeech.2017-1633.


@inproceedings{Haider2017,
  author={Fasih Haider and Fahim A. Salim and Saturnino Luz and Carl Vogel and Owen Conlan and Nick Campbell},
  title={Visual, Laughter, Applause and Spoken Expression Features for Predicting Engagement Within TED Talks},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2381--2385},
  doi={10.21437/Interspeech.2017-1633},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1633}
}