The localization of speech recognition for large-scale spoken dialog systems can be a tremendous exercise. Usually, all involved grammars have to be translated by a language expert, and new data has to be collected, transcribed, and annotated for statistical utterance classifiers, making localization a time-consuming and expensive undertaking. Often, though, a vast number of transcribed and annotated utterances already exists for the source language. In this paper, we propose to use such data by translating it into the target language with machine translation. The translated utterances and their associated (original) annotations are then used to train statistical grammars for all contexts of the target system. As an example, we localize an English spoken dialog system for Internet troubleshooting to Spanish by translating more than 4 million source utterances without any human intervention. Applying the localized system to more than 10,000 utterances collected on a similar Spanish Internet troubleshooting system, we show that its overall accuracy was only 5.7% worse than that of the English source system.
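The pipeline the abstract describes can be sketched in a few lines: translate the annotated source-language utterances with machine translation, reuse the original annotations unchanged, and train a statistical classifier on the translated data. The sketch below is purely illustrative and not the paper's actual system: the word-for-word `MT_TABLE` stands in for a real MT engine, the utterances and semantic labels are invented, and a tiny unigram Naive Bayes classifier stands in for the statistical grammars.

```python
# Illustrative sketch of MT-based localization of an utterance classifier.
# All names (MT_TABLE, translate, SOURCE_DATA, NaiveBayes) are hypothetical.
from collections import Counter, defaultdict
import math

# Toy stand-in for machine translation (a real MT system would be used here).
MT_TABLE = {
    "my": "mi", "internet": "internet", "is": "esta", "down": "caido",
    "i": "yo", "want": "quiero", "to": "a", "pay": "pagar", "bill": "factura",
    "the": "la", "modem": "modem", "not": "no", "working": "funciona",
}

def translate(utterance):
    """Word-for-word 'MT' placeholder: map known words, keep unknowns as-is."""
    return " ".join(MT_TABLE.get(w, w) for w in utterance.lower().split())

# Source-language (English) utterances with their semantic annotations.
SOURCE_DATA = [
    ("my internet is down", "ConnectionProblem"),
    ("the modem is not working", "ConnectionProblem"),
    ("i want to pay my bill", "Billing"),
]

# Step 1: translate the utterances; the annotations are reused unchanged.
target_data = [(translate(u), label) for u, label in SOURCE_DATA]

# Step 2: train a simple unigram Naive Bayes classifier on the translated data.
class NaiveBayes:
    def __init__(self, data):
        self.word_counts = defaultdict(Counter)  # per-label word frequencies
        self.label_counts = Counter()            # label priors
        self.vocab = set()
        for text, label in data:
            self.label_counts[label] += 1
            for w in text.split():
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def classify(self, text):
        def log_score(label):
            total = sum(self.word_counts[label].values())
            score = math.log(self.label_counts[label] /
                             sum(self.label_counts.values()))
            for w in text.split():
                # Add-one smoothing over the shared vocabulary.
                score += math.log((self.word_counts[label][w] + 1) /
                                  (total + len(self.vocab)))
            return score
        return max(self.label_counts, key=log_score)

clf = NaiveBayes(target_data)
print(clf.classify("mi internet esta caido"))   # -> ConnectionProblem
print(clf.classify("quiero pagar la factura"))  # -> Billing
```

At no point does a human annotate target-language data; the target-language classifier inherits its supervision entirely from the source-language annotations, which is the labor-saving point of the approach.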
Bibliographic reference. Suendermann, David / Liscombe, Jackson / Dayanidhi, Krishna / Pieraccini, Roberto (2009): "Localization of speech recognition in spoken dialog systems: how machine translation can make our lives easier", in INTERSPEECH-2009, pp. 1475-1478.