ISCA Archive Interspeech 2007

Building an information retrieval system for Serbian - challenges and solutions

Miroslav Martinović, Srdjan Vesić, Goran Rakić

We describe the challenges encountered while building an information retrieval system for the Serbian language, and present the approaches designed and adopted to handle them. As the backbone of our system, we used the SMART retrieval system, which we augmented with the features necessary to deal with the specifics of the Serbian alphabet. In addition, the morphological richness of the language heightened the importance of the text preprocessing phase. For this phase, we devised two algorithms, which increased retrieval precision by 14% and 27%, respectively. Testing was conducted on a two-gigabyte EBART collection of Serbian newspaper articles.


doi: 10.21437/Interspeech.2007-437

Cite as: Martinović, M., Vesić, S., Rakić, G. (2007) Building an information retrieval system for Serbian - challenges and solutions. Proc. Interspeech 2007, 1513-1516, doi: 10.21437/Interspeech.2007-437

@inproceedings{martinovic07_interspeech,
  author={Miroslav Martinović and Srdjan Vesić and Goran Rakić},
  title={{Building an information retrieval system for Serbian - challenges and solutions}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1513--1516},
  doi={10.21437/Interspeech.2007-437}
}