This paper addresses the automatic evaluation of a speaker's spoken fluency. Specifically, it evaluates the role of language models built from fluent and disfluent data in quantifying the fluency of a spoken monologue. We show that features based on the relative perplexities of the fluent and disfluent language models on a given utterance are indicative of the utterance's level of spoken fluency. The proposed features yield a spoken fluency classification accuracy of 39.8% for 4-class and 68.4% for 2-class classification. Combining these features with a set of prosodic features further improves classification accuracy, highlighting that the information contributed by the language-model features is complementary to the low-level disfluency information captured by the prosodic features.
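To make the idea of a relative-perplexity feature concrete, the following is a minimal sketch in Python. It is not the paper's implementation: the abstract does not specify the language model order, the smoothing method, or the exact form of the feature. The sketch assumes add-one smoothed bigram models, a log-ratio of the two perplexities, and tiny made-up training sentences. The names `BigramLM` and `relative_perplexity_feature` are hypothetical.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one smoothed bigram language model (illustrative assumption,
    not the model type used in the paper)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # counts of each token used as a left context
        self.bigram_counts = Counter()   # counts of (previous, current) token pairs
        self.vocab = {"</s>", "<unk>"}   # tokens the model can predict
        for words in sentences:
            self.vocab.update(words)
            tokens = ["<s>"] + list(words) + ["</s>"]
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def _map(self, token):
        # Out-of-vocabulary words are mapped to a single <unk> token.
        if token == "<s>" or token in self.vocab:
            return token
        return "<unk>"

    def perplexity(self, words):
        tokens = ["<s>"] + [self._map(w) for w in words] + ["</s>"]
        vocab_size = len(self.vocab)
        log_prob = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            p = (self.bigram_counts[(prev, cur)] + 1) / (
                self.context_counts[prev] + vocab_size
            )
            log_prob += math.log(p)
        # Perplexity is the exponential of the average negative log-probability.
        return math.exp(-log_prob / (len(tokens) - 1))


def relative_perplexity_feature(fluent_lm, disfluent_lm, words):
    """Hypothetical feature: log ratio of disfluent-model to fluent-model
    perplexity. Positive values mean the fluent model fits the utterance better."""
    return math.log(disfluent_lm.perplexity(words)) - math.log(
        fluent_lm.perplexity(words)
    )


# Toy training data, invented purely for illustration.
fluent_data = [
    ["i", "went", "to", "the", "market", "yesterday"],
    ["we", "like", "to", "read", "books"],
]
disfluent_data = [
    ["i", "uh", "went", "um", "to", "to", "the", "market"],
    ["we", "uh", "we", "like", "um", "books"],
]

fluent_lm = BigramLM(fluent_data)
disfluent_lm = BigramLM(disfluent_data)

fluent_utt = ["i", "went", "to", "the", "market"]
disfluent_utt = ["i", "uh", "went", "um", "to", "the", "market"]

score_fluent = relative_perplexity_feature(fluent_lm, disfluent_lm, fluent_utt)
score_disfluent = relative_perplexity_feature(fluent_lm, disfluent_lm, disfluent_utt)
```

In this toy setup, `score_fluent` is positive and `score_disfluent` is negative, so the sign of the feature separates the two utterances. In a realistic system such scores would be passed to a classifier, possibly alongside prosodic features, rather than thresholded directly.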
Bibliographic reference. Deshmukh, Om D. / Doddala, Harish / Verma, Ashish / Visweswariah, Karthik (2010): "Role of language models in spoken fluency evaluation", In INTERSPEECH-2010, 2866-2869.