Abstract
As the number of non-English documents on the web grows dramatically, the study and design of information retrieval systems for these languages becomes increasingly important. Persian is the official language of Iran, Afghanistan and Tajikistan and is also spoken in some other countries in the Middle East, so a significant amount of Persian documents is available on the web. In this study, we present and compare our English-Persian cross-language text retrieval experiments on the Hamshahri text collection. We also present the Combinatorial Translation Probability (CTP) calculation method for query translation, which estimates translation probabilities from the collection itself.
Publication Details
This conference paper was originally published as AleAhmad, A, Amiri, H, Rahzogar, M and Oroumchian, F, Experiments with English-Persian text retrieval, in Proceedings of iNews'08 - the 2nd ACM workshop on improving non english web searching, 30 October 2008, California, 77-80.