Abstract
Secondary indexes are often used in database management systems for secondary key retrieval. Although their use can improve retrieval time significantly, the cost of index maintenance and storage increases the overhead of the file processing application. The optimal set of indexed secondary keys for a particular application depends on a number of application-dependent factors. In this paper, a cost function is developed for the evaluation of candidate indexing choices and applied to the optimization of index selection. Factors accounted for include file size, the relative rates of retrieval and maintenance, the distribution of retrieval and maintenance over the candidate keys, index structure, and system charging rates. Among the results demonstrated are the increased effectiveness of secondary indexes for large files, the effect of the relative rates of retrieval and maintenance, the greater cost of allowing for arbitrarily formulated queries, and the impact on cost of the use of different index structures.
Index Terms
- Minimum cost selection of secondary indexes for formatted files