In this paper, we address issues that arise when crowdsourcing the collection of user queries for situated dialog systems in a moving car. Compared to unimodal spoken dialog systems, such as those for smartphones, collecting dialog data for situated dialog systems is more costly because the user needs a clear awareness of the physical surroundings to make realistic queries. We therefore consider crowdsourcing such queries. To elicit queries from crowd workers, we propose methods of prompting them with visual information. The queries collected with these crowdsourcing methods are compared to those collected with a real situated dialog system. Specifically, we evaluate them on several measures: similarity in semantic content, naturalness of language expression, and bias in the collected data. We demonstrate that our crowdsourcing method produced a language resource whose text is more similar to real user utterances than one generated by a handcrafted grammar.
Bibliographic reference. Misu, Teruhisa (2014): "Crowdsourcing for situated dialog systems in a moving car", In INTERSPEECH-2014, 125-129.