In this paper, we describe and compare systems for text normalization based on statistical machine translation (SMT) methods, constructed with the support of internet users. By normalizing text displayed in a web interface, internet users provide a parallel corpus of normalized and non-normalized text. From this corpus, SMT models are generated that translate non-normalized text into normalized text. Building traditional language-specific text normalization systems requires linguistic knowledge as well as solid computer skills to implement the normalization rules. Our systems can be built without profound computer knowledge, thanks to the simple, self-explanatory user interface and the automatic generation of the SMT models. Additionally, no in-house knowledge of the language to be normalized is required, because the internet community supplies the multilingual expertise. All techniques are applied to French texts crawled with our Rapid Language Adaptation Toolkit [1] and are compared using Levenshtein edit distance, BLEU score, and perplexity.

[1] Tanja Schultz and Alan Black. Rapid Language Adaptation Tools and Technologies for Multilingual Speech Processing. In: Proc. ICASSP, Las Vegas, NV, 2008.
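As an illustration of the first evaluation measure named above, the following is a minimal sketch of a token-level Levenshtein edit distance between a system output and a reference normalization. The function name `edit_distance` and the French example sentences are assumptions made for illustration only; they are not taken from the paper, and the paper's actual evaluation setup may differ (for example, in tokenization or in how distances are normalized).

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two sequences (e.g. lists of tokens).

    Counts the minimum number of insertions, deletions, and
    substitutions needed to turn `hyp` into `ref`.
    """
    # prev[j] holds the distance between the hyp prefix processed so far
    # and the first j items of ref.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from i hyp items to an empty ref
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical example: raw text vs. a hand-normalized reference.
raw = "il a 3 chats".split()
reference = "il a trois chats".split()
print(edit_distance(raw, reference))  # 1 (one substituted token)
```

A lower distance to the reference indicates output closer to the human normalization; in practice the count is often divided by the reference length so that sentences of different lengths can be compared.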
Bibliographic reference. Schlippe, Tim / Zhu, Chenfei / Gebhardt, Jan / Schultz, Tanja (2010): "Text normalization based on statistical machine translation and internet user support", In INTERSPEECH-2010, 1816-1819.