ABSTRACT
Current database systems process all the data relevant to a query before returning a response. As a result, the volume of data to be processed becomes excessive for real-time or time-constrained environments. A new methodology is needed to systematically reduce the time spent processing the data involved in a query. To this end, we propose to use data samples to construct an approximate synthetic response to a given query.
In this paper, we consider only COUNT(E) type queries, where E is an arbitrary relational algebra expression. We make no assumptions about the distribution of attribute values and ordering of tuples in the input relations, and propose consistent and unbiased estimators for arbitrary COUNT(E) type queries. We design a sampling plan based on the cluster sampling method to improve the utilization of sampled data and to reduce the cost of sampling. We also evaluate the performance of the proposed estimators.
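As a rough illustration of the idea, the sketch below estimates COUNT over a single selection by sampling whole disk pages (clusters) without replacement and scaling the observed matches by M/m, the textbook unbiased estimator of a population total under cluster sampling. It is not the estimator proposed in the paper, which covers arbitrary relational algebra expressions; the names `estimate_count`, `pages`, and `is_match`, and the toy relation, are assumptions made for this example.

```python
import random


def estimate_count(pages, predicate, m, rng):
    """Estimate COUNT of rows satisfying predicate from m sampled pages.

    pages: list of pages, each a list of rows (one page = one cluster).
    Scales the matches seen in the sample by M / m, where M is the
    total number of pages.
    """
    M = len(pages)
    if not 1 <= m <= M:
        raise ValueError("m must be between 1 and the number of pages")
    sampled = rng.sample(pages, m)  # simple random sample without replacement
    hits = sum(1 for page in sampled for row in page if predicate(row))
    return M / m * hits


# Hypothetical relation: 1000 integer rows stored 50 per page (20 pages).
rows = list(range(1000))
pages = [rows[i:i + 50] for i in range(0, len(rows), 50)]


def is_match(value):
    # Hypothetical selection condition.
    return value % 4 == 0


exact = sum(1 for v in rows if is_match(v))
approx = estimate_count(pages, is_match, 5, random.Random(0))
full = estimate_count(pages, is_match, len(pages), random.Random(0))
```

Sampling pages rather than individual tuples means every tuple on a fetched page contributes to the estimate, which is the sense in which cluster sampling improves the utilization of sampled data. When `m` equals the number of pages the estimate reduces to the exact count.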