In the literature, languages have been characterized as having more or less transparent orthographies, depending on how predictable their spelling-to-sound correspondences are. Quantitative measures that are based on large-scale language corpora and can objectively assess such cross-linguistic variation are rather scarce. The quantitative assessment method presented here builds on the correlation between distances of phonemic and graphemic frequency distributions of a given sample and similar distances obtained from large corpora of the same language. The metric itself may be used as a research tool to investigate the potential effect of orthographic transparency on the development and performance of reading in different populations.
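As an illustration only, the following minimal sketch shows one possible reading of such a correlation-based measure. It is not the authors' implementation. The abstract does not specify the distance measure, the correlation coefficient, or how samples are compared with the corpus, so the Euclidean distance, the Pearson coefficient, and all function names used here (distribution, euclidean, pearson, transparency_score) are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def distribution(symbols):
    """Relative frequency of each symbol (grapheme or phoneme) in a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def euclidean(p, q):
    """Euclidean distance between two frequency distributions (assumed measure)."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed choice of correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when all distances are equal")
    return cov / sqrt(vx * vy)


def transparency_score(samples, corpus_graphemes, corpus_phonemes):
    """Correlate graphemic and phonemic sample-to-corpus distances.

    samples: list of (grapheme_sequence, phoneme_sequence) pairs.
    corpus_graphemes, corpus_phonemes: reference sequences from a large corpus.
    """
    ref_g = distribution(corpus_graphemes)
    ref_p = distribution(corpus_phonemes)
    g_dist = [euclidean(distribution(g), ref_g) for g, _ in samples]
    p_dist = [euclidean(distribution(p), ref_p) for _, p in samples]
    return pearson(g_dist, p_dist)
```

Under these assumptions, a toy language in which every grapheme maps to exactly one phoneme yields identical graphemic and phonemic distances for every sample, so the score approaches 1; less predictable mappings would be expected to lower it.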
Bibliographic reference. Coene, Martine / Hammer, Annemiek / Kowalczyk, Wojtek / Bosch, Louis ten / Vaerenberg, Bart / Govaerts, Paul J. (2013): "Quantifying cross-linguistic variation in grapheme-to-phoneme mapping", In INTERSPEECH-2013, 1854-1857.