skip to main content
research-article

GRETA: graph-based real-time event trend aggregation

Published:01 September 2017Publication History
Skip Abstract Section

Abstract

Streaming applications from algorithmic trading to traffic management deploy Kleene patterns to detect and aggregate arbitrarily-long event sequences, called event trends. State-of-the-art systems process such queries in two steps. Namely, they first construct all trends and then aggregate them. Due to the exponential costs of trend construction, this two-step approach suffers from both a long delays and high memory costs. To overcome these limitations, we propose the Graph-based Real-time Event Trend Aggregation (GRETA) approach that dynamically computes event trend aggregation without first constructing these trends. We define the GRETA graph to compactly encode all trends. Our GRETA runtime incrementally maintains the graph, while dynamically propagating aggregates along its edges. Based on the graph, the final aggregate is incrementally updated and instantaneously returned at the end of each query window. Our GRETA runtime represents a win-win solution, reducing both the time complexity from exponential to quadratic and the space complexity from exponential to linear in the number of events. Our experiments demonstrate that GRETA achieves up to four orders of magnitude speed-up and up to 50--fold memory reduction compared to the state-of-the-art two-step approaches.

References

  1. Esper. http://www.espertech.com/.Google ScholarGoogle Scholar
  2. Flink. https://flink.apache.org/.Google ScholarGoogle Scholar
  3. Google Dataflow. https://cloud.google.com/dataflow/.Google ScholarGoogle Scholar
  4. Microsoft StreamInsight. https://technet.microsoft.com/en-us/library/ee362541%28v=sql.111%29.aspx.Google ScholarGoogle Scholar
  5. Stock data. http://davis.wpi.edu/datasets/Stock_Trace_Data/.Google ScholarGoogle Scholar
  6. J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman. Efficient pattern matching over event streams. In SIGMOD, pages 147--160, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear road: A stream data management benchmark. PVLDB, 30(1):480--491, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Arasu and J. Widom. Resource sharing in continuous sliding-window aggregates. PVLDB, 30(1):336--347, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Demers, J. Gehrke, B. Panda, M. Riedewald, V. Sharma, and W. White. Cayuga: A general purpose event monitoring system. In CIDR, pages 412--422, 2007.Google ScholarGoogle Scholar
  10. T. M. Ghanem, M. A. Hammad, M. F. Mokbel, W. G. Aref, and A. K. Elmagarmid. Incremental evaluation of sliding-window queries over data streams. IEEE Trans. on Knowl. and Data Eng., 19(1):57--72, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. Data Min. Knowl. Discov., 1(1):29--53, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Khan. 501 Stock Market Tips and Guidelines. Writers Club Press, 2002.Google ScholarGoogle Scholar
  13. S. Krishnamurthy, C. Wu, and M. J. Franklin. On-the-fly sharing for streamed aggregation. In SIGMOD, pages 623--634, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. Lerner and D. Shasha. AQuery: Query language for ordered data, optimization techniques, and experiments. PVLDB, 29(1):345--356, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. No pane, no gain: Efficient evaluation of sliding window aggregates over data streams. In SIGMOD, pages 39--44, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. Semantics and evaluation techniques for window aggregates in data streams. In SIGMOD, pages 311--322, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Li, K. Tufte, V. Shkapenyuk, V. Papadimos, T. Johnson, and D. Maier. Out-of-order processing: a new architecture for high-performance stream systems. PVLDB, 1(1):274--288, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. M. Liu, M. Li, D. Golovnya, E. A. Rundensteiner, and K. T. Claypool. Sequence pattern query processing over out-of-order event streams. In ICDE, pages 784--795, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. M. Liu, E. A. Rundensteiner, K. Greenfield, C. Gupta, S. Wang, I. Ari, and A. Mehta. E-Cube: Multi-dimensional event sequence analysis using hierarchical pattern query sharing. In SIGMOD, pages 889--900, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. E. Lo, B. Kao, W.-S. Ho, S. D. Lee, C. K. Chui, and D. W. Cheung. OLAP on sequence data. In SIGMOD, pages 649--660, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Meehan, N. Tatbul, S. Zdonik, C. Aslantas, U. Cetintemel, J. Du, T. Kraska, S. Madden, D. Maier, A. Pavlo, M. Stonebraker, K. Tufte, and H. Wang. S-Store: Streaming Meets Transaction Processing. PVLDB, 8(13):2134--2145, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Y. Mei and S. Madden. ZStream: A Cost-based Query Processor for Adaptively Detecting Composite Events. In SIGMOD, pages 193--206, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. I. Motakis and C. Zaniolo. Temporal aggregation in active database rules. In SIGMOD, pages 440--451, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. O. Poppe, C. Lei, S. Ahmed, and E. A. Rundensteiner. Complete event trend detection in high-rate event streams. In SIGMOD, pages 109--124, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. O. Poppe, C. Lei, E. A. Rundensteiner, and D. Maier. GRETA: Graph-based Real-time Event Trend Aggregation. http://users.wpi.edu/~opoppe/papers/Greta-full.pdf, 2017. Technical report in progress.Google ScholarGoogle Scholar
  26. Y. Qi, L. Cao, M. Ray, and E. A. Rundensteiner. Complex event analytics: Online aggregation of stream sequence patterns. In SIGMOD, pages 229--240, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. Sadri, C. Zaniolo, A. Zarkesh, and J. Abidi. Expressing and optimizing sequence queries in database systems. In ACM Trans. on Database Systems, pages 282--318, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. U. Schöning. Theoretische Informatik - kurzgefaßt (3. Aufl.). Spektrum Akademischer Verlag, 1997.Google ScholarGoogle Scholar
  29. P. Seshadri, M. Livny, and R. Ramakrishnan. SEQ: Design and Implementation of a Sequence Database System. PVLDB, 22(1):99--110, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. K. Tangwongsan, M. Hirzel, S. Schneider, and K.-L. Wu. General incremental sliding-window aggregation. PVLDB, 8(7):702--713, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. E. Wu, Y. Diao, and S. Rizvi. High-performance Complex Event Processing over streams. In SIGMOD, pages 407--418, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H. Zhang, Y. Diao, and N. Immerman. On complexity and optimization of expensive queries in Complex Event Processing. In SIGMOD, pages 217--228, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. R. Zhang, N. Koudas, B. C. Ooi, and D. Srivastava. Multiple aggregations over data streams. In SIGMOD, pages 299--310, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. R. Zhang, N. Koudas, B. C. Ooi, D. Srivastava, and P. Zhou. Streaming multiple aggregations using phantoms. PVLDB, 19(4):557--583, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 11, Issue 1
    Proceedings of the 44th International Conference on Very Large Data Bases, Rio de Janeiro, Brazil
    September 2017
    120 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 September 2017
    Published in pvldb Volume 11, Issue 1

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader