Article

Incremental maintenance of quotient cube for median

Authors:
Cuiping Li, Renmin University of China, Beijing, China
Gao Cong, National University of Singapore, Singapore
Anthony K. H. Tung, National University of Singapore, Singapore
Shan Wang, Renmin University of China, Beijing, China

KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 2004, Pages 226–235
https://doi.org/10.1145/1014052.1014079

Published: 22 August 2004

ABSTRACT

Data cube pre-computation is an important technique for supporting OLAP (Online Analytical Processing) and has been studied extensively. Computing a complete data cube is often infeasible because of its huge storage requirement. The recently proposed quotient cube addresses this issue with a partitioning method that groups cube cells into equivalence partitions. This approach is useful not only for distributive aggregate functions such as SUM but also for holistic aggregate functions such as MEDIAN.

Maintaining a data cube for holistic aggregation is hard because historical tuple values must be kept in order to compute the new aggregate when tuples are inserted or deleted. The quotient cube makes the problem harder, since the equivalence classes must also be maintained. In this paper, we introduce two techniques, the addset data structure and the sliding window, to deal with this problem. We develop efficient algorithms for maintaining a quotient cube with holistic aggregation functions while using reasonably small storage space. A performance study shows that our algorithms are effective, efficient, and scalable over large databases.
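To make the distinction between distributive and holistic aggregates concrete, the following is a minimal Python sketch, not the paper's addset or sliding-window technique. It assumes a single cube cell with numeric measure values; the class names `SumCell` and `MedianCell` are hypothetical and chosen only for illustration. A SUM cell can be updated from its previous aggregate alone, whereas a MEDIAN cell has to retain the underlying values so the aggregate can be recomputed after an insertion or deletion.

```python
import bisect


class SumCell:
    """Distributive aggregate: the old total suffices to apply an update."""

    def __init__(self):
        self.total = 0

    def insert(self, value):
        self.total += value

    def delete(self, value):
        self.total -= value


class MedianCell:
    """Holistic aggregate: keeps every measure value of the cell, sorted.

    This naive version stores the full history; it only illustrates why
    history is needed, not how the paper reduces that storage.
    """

    def __init__(self):
        self.values = []  # sorted list of all measure values in the cell

    def insert(self, value):
        bisect.insort(self.values, value)

    def delete(self, value):
        i = bisect.bisect_left(self.values, value)
        if i == len(self.values) or self.values[i] != value:
            raise KeyError(value)
        self.values.pop(i)

    def median(self):
        if not self.values:
            raise ValueError("empty cell")
        n = len(self.values)
        mid = n // 2
        if n % 2:
            return self.values[mid]
        return (self.values[mid - 1] + self.values[mid]) / 2


# Usage: the same updates applied to both kinds of cell.
sum_cell, median_cell = SumCell(), MedianCell()
for v in (5, 1, 9, 3):
    sum_cell.insert(v)
    median_cell.insert(v)
median_before = median_cell.median()  # values 1, 3, 5, 9

sum_cell.delete(9)
median_cell.delete(9)
median_after = median_cell.median()   # values 1, 3, 5
```

Storing the full sorted history per cell is the simplest correct choice, but its space cost grows with the number of tuples, which is the storage problem the abstract says the proposed techniques are designed to reduce.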

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004
874 pages
ISBN: 1581138881
DOI: 10.1145/1014052
General Chairs:
Won Kim, Cyber Database Solutions
Ronny Kohavi, Amazon.com
Program Chairs:
Johannes Gehrke, Cornell University
William DuMouchel, AT&T Labs Research
Copyright © 2004 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data cube
holistic aggregation
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%