DOI: 10.1145/1989323.1989416
research-article

E-Cube: multi-dimensional event sequence analysis using hierarchical pattern query sharing

Published: 12 June 2011

ABSTRACT

Many modern applications, including online financial feeds, tag-based mass transit systems, and RFID-based supply chain management systems, transmit real-time data streams. Event stream processing technology is needed to analyze this vast amount of sequential data and enable online operational decision making. Existing techniques fall short: traditional online analytical processing (OLAP) systems are not designed for real-time pattern-based operations, while state-of-the-art Complex Event Processing (CEP) systems designed for sequence detection do not support OLAP operations. We propose a novel E-Cube model that combines CEP and OLAP techniques for efficient multi-dimensional event pattern analysis at different abstraction levels. Our analysis of the interrelationships among queries, in both concept abstraction and pattern refinement, allows these queries to be composed into an integrated E-Cube hierarchy. Based on this hierarchy, we develop drill-down strategies (refinement from abstract to more specific patterns) and roll-up strategies (generalization from specific to more abstract patterns) for efficient workload evaluation. These execution strategies reuse intermediate results along both the concept and the pattern refinement relationships between queries. On this foundation, we design a cost-driven adaptive optimizer called Chase that exploits the above reuse strategies for optimal E-Cube hierarchy execution. Our experimental studies, which compare alternative strategies on a real-world financial data stream under different workload conditions, demonstrate the superiority of the Chase method. In particular, Chase execution in many cases runs tenfold faster than the state-of-the-art strategy for real stock market query workloads.
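The abstract does not give algorithmic detail, so the following is only a toy sketch of the reuse idea it describes, not the E-Cube or Chase implementation. It assumes a two-level concept hierarchy over stock symbols, a simple SEQ pattern with a time window, and hypothetical helper names (`is_a`, `match_seq`, `roll_up`, `drill_down`). Roll-up here is a plain union of child-query results, which is valid only under the stated assumption that the child concepts fully cover the parent concept.

```python
# Hypothetical concept hierarchy (an assumption): event type -> parent concept.
PARENT = {"IBM": "Tech", "MSFT": "Tech", "XOM": "Energy"}


def is_a(event_type, concept):
    """True if event_type equals concept or lies below it in the hierarchy."""
    while event_type is not None:
        if event_type == concept:
            return True
        event_type = PARENT.get(event_type)
    return False


def match_seq(events, pattern, window):
    """Naively evaluate SEQ(pattern) over events given as (timestamp, type).

    A match is a tuple of events, one per pattern position, with strictly
    increasing timestamps and a total span of at most `window`.
    """
    partials = [()]
    for concept in pattern:
        extended = []
        for partial in partials:
            for ev in events:
                if not is_a(ev[1], concept):
                    continue
                if partial and ev[0] <= partial[-1][0]:
                    continue  # must occur strictly after the previous event
                if partial and ev[0] - partial[0][0] > window:
                    continue  # whole match must fit inside the window
                extended.append(partial + (ev,))
        partials = extended
    return partials


def roll_up(child_results):
    """Answer an abstract query by merging the results of its child queries."""
    merged = set()
    for result in child_results:
        merged.update(result)
    return sorted(merged)


def drill_down(abstract_results, pattern):
    """Answer a specific query by filtering the results of an abstract one."""
    return [
        match for match in abstract_results
        if all(is_a(ev[1], concept) for ev, concept in zip(match, pattern))
    ]


events = [(1, "IBM"), (2, "XOM"), (3, "MSFT"), (4, "XOM"), (9, "XOM")]
WINDOW = 5

direct = sorted(match_seq(events, ("Tech", "Energy"), WINDOW))
reused = roll_up([
    match_seq(events, ("IBM", "XOM"), WINDOW),
    match_seq(events, ("MSFT", "XOM"), WINDOW),
])
ibm_only = drill_down(direct, ("IBM", "XOM"))
```

In this sketch, `reused` equals `direct`, and `ibm_only` equals the result of evaluating SEQ(IBM, XOM) from scratch, so either direction avoids rescanning the stream once a related query has been answered. Choosing which direction is cheaper for a given workload is the kind of decision the abstract attributes to the cost-driven optimizer; that step is not modeled here.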


Published in

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011
1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
