ABSTRACT
Partitioning very large databases by clustering is necessary to reduce the space and time complexity of retrieval operations. However, modern retrieval environments demand dynamic maintenance of clusters. A new cluster maintenance strategy is proposed, and its similarity/stability characteristics, cost, and retrieval behavior are examined through a series of experiments that compare it with unclustered and completely reclustered database environments.
Index Terms
- A dynamic cluster maintenance system for information retrieval