Our previous work showed that tonal and pitch accent languages exhibit strong prosodic characteristics, which leads to better performance in detecting these languages. This study uses an entropy-based approach to analyze prosodic features for effective modeling. Seventeen tonal or pitch accent languages, including a number of under-resourced African languages, are studied. Prosodic trigrams are rated as strong, moderate, or weak according to the language-specific information they contain. This three-level rating helps to identify the most efficient prosodic trigrams for language recognition. The feature inventory is reduced by 80% with acceptable performance degradation. The important prosodic attributes identified by the analysis are consistent with known linguistic facts about the individual languages. With this analysis method, feature selection can be applied to an expanded prosodic feature inventory to explore better performance in detecting non-tonal languages.
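As an illustration of the three-level rating idea, the following is a minimal sketch and not the procedure used in the paper. It assumes that each prosodic trigram comes with a count of occurrences per language, that language-specific information is measured by the normalized Shannon entropy of the language distribution given the trigram (low entropy meaning the trigram is concentrated in few languages), and that two thresholds separate the three levels. The function names, the count format, and the threshold values are all hypothetical.

```python
import math
from typing import Dict


def language_entropy(counts: Dict[str, int]) -> float:
    """Normalized entropy (0 to 1) of the language distribution for one trigram.

    `counts` maps a language label to how often the trigram occurred in it.
    Assumption: counts are already comparable across languages.
    """
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        raise ValueError("need a non-empty count table over at least two languages")
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    # Divide by the maximum possible entropy so the result does not
    # depend on how many languages are in the table.
    return h / math.log2(len(counts))


def rate_trigram(counts: Dict[str, int],
                 strong_below: float = 0.5,
                 weak_above: float = 0.9) -> str:
    """Rate a trigram as 'strong', 'moderate', or 'weak'.

    The thresholds are illustrative placeholders, not values from the paper.
    """
    h = language_entropy(counts)
    if h < strong_below:
        return "strong"    # concentrated in few languages
    if h > weak_above:
        return "weak"      # spread almost evenly over all languages
    return "moderate"


if __name__ == "__main__":
    # Hypothetical counts for three trigrams over three languages.
    print(rate_trigram({"lang_a": 100, "lang_b": 0, "lang_c": 0}))
    print(rate_trigram({"lang_a": 60, "lang_b": 30, "lang_c": 10}))
    print(rate_trigram({"lang_a": 10, "lang_b": 10, "lang_c": 10}))
```

Under these assumptions, reducing the feature inventory would amount to keeping only the trigrams rated as strong (and possibly moderate) and discarding the rest.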
Index Terms: Language recognition, entropy, tonal languages, pitch accent languages, under-resourced languages
Cite as: Ng, R.W.M., Leung, C.-C., Lee, T., Ma, B., Li, H. (2010) An entropy-based approach for comparing prosodic properties in tonal and pitch accent languages. Proc. Speech Prosody 2010, paper 093
@inproceedings{ng10_speechprosody,
  author={Raymond W. M. Ng and Cheung-Chi Leung and Tan Lee and Bin Ma and Haizhou Li},
  title={{An entropy-based approach for comparing prosodic properties in tonal and pitch accent languages}},
  year=2010,
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 093}
}