We describe the challenges encountered while building an information retrieval system for the Serbian language and present the approaches we designed and adopted to handle them. As the backbone of our system, we used the SMART retrieval system, which we augmented with the features necessary to deal with the specificities of the Serbian alphabet. In addition, the morphological richness of the language heightened the importance of the text preprocessing phase. For this phase, we devised two algorithms, which increased retrieval precision by 14% and 27%, respectively. Testing was conducted on the two-gigabyte EBART collection of Serbian newspaper articles.
Bibliographic reference. Martinović, Miroslav / Vesić, Srdjan / Rakić, Goran (2007): "Building an information retrieval system for Serbian - challenges and solutions", In INTERSPEECH-2007, 1513-1516.