16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Automatic Audio Sentiment Extraction Using Keyword Spotting

Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen

University of Texas at Dallas, USA

Most existing methods for audio sentiment analysis use automatic speech recognition to convert speech to text, and feed the textual input to text-based sentiment classifiers. This study shows that such methods may not be optimal, and proposes an alternative architecture in which a single keyword spotting (KWS) system is developed for sentiment detection. In the new architecture, the text-based sentiment classifier is used to automatically determine the most powerful sentiment-bearing terms, which are then used as the term list for KWS. To obtain a compact yet powerful term list, a new method is proposed that reduces the model complexity of the text-based sentiment classifier while maintaining good classification accuracy. Finally, the term list information is used to build a more focused language model for the speech recognition system. The result is a single integrated solution focused on the vocabulary that directly impacts classification. The proposed solution is evaluated on videos from and UT-Opinion corpus (which contains naturalistic opinionated audio collected in real-world conditions). Our experimental results show that the KWS-based system significantly outperforms the traditional architecture in difficult practical tasks.
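The idea of deriving a compact KWS term list from a text-based sentiment classifier can be illustrated with a minimal sketch. The abstract does not specify the classifier or the complexity-reduction method, so everything below is an assumption made for illustration: the smoothed log-odds weighting, the top-k selection rule, the function names, and the toy training sentences are all hypothetical and are not the authors' method.

```python
import math
from collections import Counter


def term_log_odds(docs, labels, smoothing=1.0):
    """Smoothed log-odds of each term occurring in positive vs. negative text.

    Assumption: a simple stand-in for the weights of a text-based
    sentiment classifier; the paper's actual model is not given here.
    """
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        (pos if label == 1 else neg).update(text.lower().split())
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        t: math.log((pos[t] + smoothing) / pos_total)
        - math.log((neg[t] + smoothing) / neg_total)
        for t in vocab
    }


def select_term_list(weights, k):
    """Keep the k terms with the largest absolute weight (ties: alphabetical)."""
    ranked = sorted(weights, key=lambda t: (-abs(weights[t]), t))
    return {t: weights[t] for t in ranked[:k]}


def classify_from_hits(hits, term_list):
    """Sum the weights of spotted keywords; a non-negative score means positive."""
    score = sum(term_list.get(h, 0.0) for h in hits)
    return 1 if score >= 0 else 0


# Toy, invented training text (1 = positive, 0 = negative).
docs = [
    "great phone great camera",
    "love the great screen",
    "terrible battery terrible support",
    "hate the terrible screen",
]
labels = [1, 1, 0, 0]

weights = term_log_odds(docs, labels)
term_list = select_term_list(weights, k=2)  # compact list handed to the KWS

# Pretend the keyword spotter detected these terms in two audio clips.
print(sorted(term_list))
print(classify_from_hits(["great"], term_list))
print(classify_from_hits(["terrible", "screen"], term_list))
```

In this sketch only the selected terms need to be detected in the audio, so a word outside the term list (here "screen") contributes nothing to the decision. This mirrors the abstract's point that the system can concentrate on the vocabulary that directly affects classification.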

