ABSTRACT
Many modern applications, including online financial feeds, tag-based mass transit systems, and RFID-based supply chain management systems, transmit real-time data streams. There is a need for event stream processing technology to analyze this vast amount of sequential data to enable online operational decision making. Existing techniques such as traditional online analytical processing (OLAP) systems are not designed for real-time pattern-based operations, while state-of-the-art Complex Event Processing (CEP) systems designed for sequence detection do not support OLAP operations. We propose a novel E-Cube model which combines CEP and OLAP techniques for efficient multi-dimensional event pattern analysis at different abstraction levels. Our analysis of the interrelationships in both concept abstraction and pattern refinement among queries facilitates the composition of these queries into an integrated E-Cube hierarchy. Based on this E-Cube hierarchy, strategies of drill-down (refinement from abstract to more specific patterns) and of roll-up (generalization from specific to more abstract patterns) are developed for efficient workload evaluation. Our proposed execution strategies reuse intermediate results along both the concept and the pattern refinement relationships between queries. Based on this foundation, we design a cost-driven adaptive optimizer called Chase that exploits the above reuse strategies for optimal E-Cube hierarchy execution. Our experimental studies comparing alternative strategies on a real-world financial data stream under different workload conditions demonstrate the superiority of the Chase method. In particular, our Chase execution in many cases performs tenfold faster than the state-of-the-art strategy for real stock market query workloads.
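The drill-down idea described above can be illustrated with a minimal sketch. This is not the paper's implementation or the Chase optimizer: the concept hierarchy `PARENT`, the helper names `generalizes`, `match_seq`, and `drill_down`, the ticker symbols, and the toy stream are all assumptions made for illustration, and the matcher is a naive enumeration. It shows only that the matches of a more specific sequence pattern can be obtained by filtering the stored matches of a more abstract pattern rather than by rescanning the stream.

```python
# Hypothetical concept hierarchy: specific event type -> abstract concept.
PARENT = {"DELL": "Computer", "IBM": "Computer",
          "AMZN": "Retail", "EBAY": "Retail"}


def generalizes(concept, event_type):
    """True if event_type is the concept itself or rolls up to it."""
    return event_type == concept or PARENT.get(event_type) == concept


def match_seq(pattern, stream, window):
    """Naively enumerate all matches of SEQ(pattern[0], ..., pattern[-1]).

    stream is a list of (event_type, timestamp) pairs sorted by timestamp.
    A match must fit within `window` time units of its first event.
    """
    results = []

    def extend(prefix, start):
        if len(prefix) == len(pattern):
            results.append(tuple(prefix))
            return
        for i in range(start, len(stream)):
            etype, ts = stream[i]
            if prefix and ts - prefix[0][1] > window:
                break  # stream is sorted, so later events are out of window too
            if generalizes(pattern[len(prefix)], etype):
                extend(prefix + [stream[i]], i + 1)

    extend([], 0)
    return results


def drill_down(coarse_matches, fine_pattern):
    """Derive the fine pattern's matches by filtering the coarse matches."""
    return [m for m in coarse_matches
            if all(generalizes(c, e[0]) for c, e in zip(fine_pattern, m))]


# Toy stream (assumed data, not from the paper).
stream = [("DELL", 1), ("AMZN", 2), ("IBM", 3), ("EBAY", 4), ("AMZN", 5)]

coarse = match_seq(("Computer", "Retail"), stream, window=10)
fine_reused = drill_down(coarse, ("DELL", "AMZN"))
fine_direct = match_seq(("DELL", "AMZN"), stream, window=10)
```

In this sketch `fine_reused` and `fine_direct` hold the same matches, so the specific query is answered from the abstract query's results. Whether such reuse is cheaper than direct evaluation depends on the workload, which is the kind of decision the abstract attributes to a cost-driven optimizer.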
Index Terms
- E-Cube: multi-dimensional event sequence analysis using hierarchical pattern query sharing