Abstract
We show that any approach to developing optimum retrieval functions is based on two kinds of assumptions: first, a certain form of representation for documents and requests, and second, additional simplifying assumptions that predefine the type of the retrieval function. We then describe an approach to developing optimum polynomial retrieval functions: request-document pairs (f_l, d_m) are mapped onto description vectors x(f_l, d_m), and a polynomial function e(x) is developed such that it yields estimates of the probability of relevance P(R | x(f_l, d_m)) with minimum square errors. We give experimental results for the application of this approach to documents with weighted indexing as well as to documents with complex representations. In contrast to other probabilistic models, our approach yields estimates of the actual probabilities, can handle very complex representations of documents and requests, and can be easily applied to multivalued relevance scales. On the other hand, this approach is not suited to log-linear probabilistic models, and it needs large samples of relevance feedback data for its application.
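The least-squares idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`poly_features`, `fit_retrieval_function`, `estimate`), the choice of a complete polynomial up to a given degree, and the toy data are all assumptions made for illustration. The sketch expands each description vector x into polynomial terms and solves the normal equations, so that e(x) approximates the binary relevance judgments with minimum square error.

```python
from itertools import combinations_with_replacement


def poly_features(x, degree=2):
    """Expand x into [1, x_i, x_i*x_j, ...] with all terms up to `degree`."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in idx:
                term *= x[i]
            feats.append(term)
    return feats


def solve_linear_system(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations; data too sparse")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * a[c] for c in range(r + 1, n))
        a[r] = s / M[r][r]
    return a


def fit_retrieval_function(vectors, relevance, degree=2):
    """Return coefficients minimizing the sum of (e(x) - y)^2 over the sample."""
    V = [poly_features(x, degree) for x in vectors]
    k = len(V[0])
    VtV = [[sum(v[i] * v[j] for v in V) for j in range(k)] for i in range(k)]
    Vty = [sum(v[i] * y for v, y in zip(V, relevance)) for i in range(k)]
    return solve_linear_system(VtV, Vty)


def estimate(coeffs, x, degree=2):
    """Evaluate the polynomial retrieval function e(x)."""
    return sum(a * f for a, f in zip(coeffs, poly_features(x, degree)))


# Hypothetical feedback sample: a one-component description vector
# (for example an indexing weight) and binary relevance judgments.
sample_x = [[0.0]] * 4 + [[0.5]] * 2 + [[1.0]] * 4
sample_y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
coeffs = fit_retrieval_function(sample_x, sample_y, degree=2)
ranking_scores = [estimate(coeffs, x) for x in ([0.0], [0.5], [1.0])]
```

In this toy sample the quadratic has as many coefficients as there are distinct vectors, so the fitted values coincide with the observed relative frequencies of relevance for each vector. With richer description vectors the polynomial instead smooths across vectors, which is one reason a large feedback sample is needed.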
Index Terms
- Optimum polynomial retrieval functions based on the probability ranking principle