ISCA International Workshop on Speech and Language Technology in Education (SLaTE 2009)

Wroxall Abbey Estate, Warwickshire, England
September 3-5, 2009

Analysis of Vocabulary Difficulty Using Wiktionary

Julie Medero, Mari Ostendorf

Department of Electrical Engineering, University of Washington, Seattle, WA, USA

Assessing vocabulary difficulty is useful for finding and creating texts at low reading levels. Prior work has focused on characteristics such as word length and word frequency. In this work, we explore whether other cues might be useful, using features extracted from Wiktionary entries. Comparing the vocabulary of comparable articles in Standard and Simple English Wikipedia, we find that words that appear in Standard but not Simple English tend to have shorter definitions, fewer part-of-speech types and word senses, and translations into fewer languages.
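The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `WiktionaryEntry` container, the function names, and the choice of averaging each feature are all assumptions, and the abstract does not say how the features were extracted or tested.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class WiktionaryEntry:
    """Hypothetical per-word feature record; not the paper's schema."""
    definition_length: int      # e.g. number of words in the definition text
    pos_types: int              # number of part-of-speech types listed
    senses: int                 # number of word senses listed
    translation_languages: int  # number of languages with a translation


FEATURES = ("definition_length", "pos_types", "senses", "translation_languages")


def split_vocabulary(standard_tokens, simple_tokens):
    """Return (words only in the Standard article, words in both articles)."""
    standard = {t.lower() for t in standard_tokens}
    simple = {t.lower() for t in simple_tokens}
    return standard - simple, standard & simple


def mean_features(words, entries):
    """Average each feature over the words that have an entry.

    Returns None when none of the words has an entry.
    """
    found = [entries[w] for w in words if w in entries]
    if not found:
        return None
    return {name: mean(getattr(e, name) for e in found) for name in FEATURES}


if __name__ == "__main__":
    # Invented placeholder values for illustration only; not measurements.
    entries = {
        "feline": WiktionaryEntry(6, 2, 2, 10),
        "the": WiktionaryEntry(10, 2, 4, 40),
        "rests": WiktionaryEntry(8, 2, 2, 20),
    }
    standard_only, shared = split_vocabulary(
        ["The", "feline", "rests"], ["the", "cat", "rests"]
    )
    print("Standard only:", mean_features(standard_only, entries))
    print("Shared:", mean_features(shared, entries))
```

Comparing the two averaged feature sets for a pair of comparable articles gives a first impression of whether Standard-only words differ from shared words; any real analysis would need many article pairs and a significance test.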

Bibliographic reference. Medero, Julie / Ostendorf, Mari (2009): "Analysis of vocabulary difficulty using Wiktionary", In SLaTE-2009, 61-64.