2nd Workshop on Spoken Language Technologies for Under-Resourced Languages
Universiti Sains, Penang, Malaysia
This paper introduces a language identification approach that uses syllable structure information, and reviews and compares it with other approaches. Most of these approaches rely on linguistic information: Malay affixation information, an English vocabulary list, alphabet n-grams, and grapheme n-grams. The syllable structure approach achieves the highest accuracy of the approaches compared, 93.73%, which is 1.91% higher than the second-best result in this paper. Syllable structure information thus gives better results for language identification.
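As a rough illustration of what syllable structure information might look like for a single word, the sketch below maps a word to a consonant/vowel (C/V) pattern and measures its longest consonant run. This is not the method of the paper, whose features are not specified in this abstract; the names `cv_pattern` and `max_consonant_run`, and the vowel set used, are assumptions made for illustration only.

```python
# Hypothetical sketch: orthographic consonant/vowel patterns as a crude
# stand-in for syllable structure. Not the authors' feature set.
VOWELS = set("aeiou")  # assumption: treat only these letters as vowels


def cv_pattern(word):
    """Map each letter to 'V' (vowel) or 'C' (consonant)."""
    return "".join(
        "V" if ch in VOWELS else "C"
        for ch in word.lower()
        if ch.isalpha()
    )


def max_consonant_run(word):
    """Length of the longest run of consecutive consonants in the word."""
    return max((len(run) for run in cv_pattern(word).split("V")), default=0)


if __name__ == "__main__":
    for w in ["makan", "string"]:
        print(w, cv_pattern(w), max_consonant_run(w))
```

A pattern such as this could, in principle, feed a classifier alongside the other cues listed above; how the paper actually encodes and uses syllable structure is described in the full text.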
Index Terms: Language identification, code switching, syllable structure information, Malay, English
Bibliographic reference. Yeong, Yin-Lai / Tan, Tien-Ping (2010): "Language identification of code switching Malay-English words using syllable structure information", In SLTU-2010, 142-145.