8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

People Watcher: A Game for Eliciting Human-Transcribed Data for Automated Directory Assistance

Tim Paek, Yun-Cheng Ju, Christopher Meek

Microsoft Research, USA

Automated Directory Assistance (ADA) allows users to request telephone or address information for residential and business listings using speech recognition. Because callers often phrase listings differently from how they are registered in the directory, ADA systems require transcriptions of alternative phrasings for directory listings as training data, which can be costly to acquire. A framework in which large numbers of Internet users voluntarily contribute such data is therefore of considerable value. In this paper, we introduce People Watcher, a computer game that elicits transcribed, alternative user phrasings for directory listings while entertaining players. Data generated from the game not only overlapped with actual audio transcriptions, but also yielded a statistically significant 15% relative reduction in semantic error rate when used for ADA. Furthermore, the resulting semantic accuracy was not statistically different from that obtained using the actual audio transcriptions.

Bibliographic reference. Paek, Tim / Ju, Yun-Cheng / Meek, Christopher (2007): "People watcher: a game for eliciting human-transcribed data for automated directory assistance", in INTERSPEECH-2007, 1322-1325.