In this study, several German-language corpora were compared to determine the extent to which syllable frequencies remain stable across different contexts and modalities. Although considerable differences in relative frequency were found among the more common syllables, frequency ranks proved to be more robust. Variation across corpora was mostly due to the vocabulary characteristics of particular corpus domains rather than to systematic differences between spoken and written language. The results indicate that syllable frequencies in written corpora can be taken as a rough estimate of their frequencies in spoken language.
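The distinction the abstract draws between relative frequency and rank can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `relative_frequencies` and `ranks` are hypothetical, and the two syllable lists are invented toy data standing in for syllabified written and spoken corpora.

```python
from collections import Counter


def relative_frequencies(syllables):
    """Map each syllable to its share of all syllable tokens."""
    counts = Counter(syllables)
    total = sum(counts.values())
    return {syl: n / total for syl, n in counts.items()}


def ranks(freqs):
    """Map each syllable to its frequency rank (1 = most frequent).

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(freqs, key=lambda syl: (-freqs[syl], syl))
    return {syl: i + 1 for i, syl in enumerate(ordered)}


# Invented toy data (assumption): not taken from any real corpus.
written = ["der", "die", "der", "ge", "der", "die", "ung", "ge", "der", "die"]
spoken = ["der", "die", "der", "ge", "ja", "ja", "die", "der", "ge", "ja", "der", "die"]

rel_written = relative_frequencies(written)
rel_spoken = relative_frequencies(spoken)
rank_written = ranks(rel_written)
rank_spoken = ranks(rel_spoken)

# Compare only the syllables that occur in both toy corpora.
for syl in sorted(set(rel_written) & set(rel_spoken)):
    print(
        f"{syl}: rel. freq. {rel_written[syl]:.3f} vs {rel_spoken[syl]:.3f}, "
        f"rank {rank_written[syl]} vs {rank_spoken[syl]}"
    )
```

In this toy case the most frequent syllable has a different relative frequency in each list but the same rank, which is the kind of robustness the abstract describes; real corpora would of course need proper syllabification and far larger samples.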
Bibliographic reference. Samlowski, Barbara / Möbius, Bernd / Wagner, Petra (2011): "Comparing syllable frequencies in corpora of written and spoken language", In INTERSPEECH-2011, 637-640.