China has hundreds of dialects. Traditional dialectology classifies them into seven major dialect regions, most of which contain many sub-dialects and sub-sub-dialects. Because these differ in various linguistic aspects, people from different dialect regions often cannot communicate orally. Sub-dialects within one dialect region, although sometimes still mutually unintelligible, share more common features. In this paper, a dialect pronunciation structure, which was used successfully for dialect-based speaker classification in our previous work, is examined for speaker classification and for distance measurement among cities based on sub-dialects of Mandarin. Using the finals of the dialectal utterances of a specific list of written characters, a dialect pronunciation structure is built for every speaker in a data set, and the speakers are classified based on the distances among their structures. The results of classifying 16 Mandarin speakers based on their sub-dialects show that they are classified linguistically, with little influence from age and gender. Finally, distances among sub-sub-dialects are similarly calculated and evaluated. All the results show high validity and agreement with linguistic studies.
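The idea of building one structure per speaker and comparing speakers by the distance between their structures can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the acoustic features or the distance measures, so the sketch assumes each final is summarized by a small feature vector, uses plain Euclidean distance between finals, and compares two structures by the root-mean-square difference of their pairwise distances. The function names (`structure`, `structure_distance`, `nearest_speaker`) and the toy data are hypothetical.

```python
import math
from itertools import combinations


def structure(finals):
    """Return one speaker's structure as a vector of pairwise distances.

    `finals` maps a final label to a feature vector. Both the labels and
    the features are hypothetical stand-ins for real acoustic data.
    """
    labels = sorted(finals)  # fixed order so structures are comparable
    return [math.dist(finals[a], finals[b]) for a, b in combinations(labels, 2)]


def structure_distance(s1, s2):
    """Root-mean-square difference between two structure vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s1, s2)) / len(s1))


def nearest_speaker(speakers):
    """Map each speaker to the other speaker whose structure is closest."""
    structs = {name: structure(f) for name, f in speakers.items()}
    return {
        a: min(
            (b for b in structs if b != a),
            key=lambda b: structure_distance(structs[a], structs[b]),
        )
        for a in structs
    }


# Toy data: B is A with every feature shifted by a constant offset,
# so the pairwise distances among B's finals equal those of A.
speakers = {
    "A": {"a": (0.0, 0.0), "i": (1.0, 0.0), "u": (0.0, 1.0)},
    "B": {"a": (5.0, 5.0), "i": (6.0, 5.0), "u": (5.0, 6.0)},
    "C": {"a": (0.0, 0.0), "i": (3.0, 0.0), "u": (0.0, 4.0)},
}

nearest = nearest_speaker(speakers)
print(nearest["A"], nearest["B"])
```

In this toy setup, speakers A and B are paired with each other because only the relations among the finals enter the comparison, not their absolute positions in feature space. A full classification would feed the speaker-to-speaker distances into a clustering method, which the sketch leaves out.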
Bibliographic reference. Ma, Xuebin / Nemoto, Akira / Minematsu, Nobuaki / Qiao, Yu / Hirose, Keikichi (2009): "Structural analysis of dialects, sub-dialects and sub-sub-dialects of Chinese", In INTERSPEECH-2009, 2219-2222.