ABSTRACT
Scale-out stream processing engines (SPEs) are powering large big data applications on high velocity data streams. Industrial setups require SPEs to sustain outages, varying data rates, and low-latency processing. SPEs need to transparently reconfigure stateful queries during runtime. However, state-of-the-art SPEs are not ready yet to handle on-the-fly reconfigurations of queries with terabytes of state due to three problems. These are network overhead for state migration, consistency, and overhead on data processing. In this paper, we propose Rhino, a library for efficient reconfigurations of running queries in the presence of very large distributed state. Rhino provides a handover protocol and a state migration protocol to consistently and efficiently migrate stream processing among servers. Overall, our evaluation shows that Rhino scales with state sizes of up to TBs, reconfigures a running query 15 times faster than the state-of-the-art, and reduces latency by three orders of magnitude upon a reconfiguration.
Supplemental Material
- Tyler Akidau, Alex Balikov, Kaya Bekiroug lu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, and Sam Whittle. 2013. MillWheel: fault-tolerant stream processing at internet scale. PVLDB (2013).Google Scholar
- Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, et al. 2015. The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. PVLDB (2015).Google Scholar
- Alexander Alexandrov, Rico Bergmann, Stephan Ewen, J. Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinl"ander, Matthias Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, and Daniel Warneke. 2014. The Stratosphere Platform for Big Data Analytics. The VLDB Journal (2014).Google Scholar
- Peter A. Alsberg and John D. Day. 1976. A Principle for Resilient Sharing of Distributed Resources. In ICSE.Google Scholar
- Paris Carbone, Stephan Ewen, Gyula Fóra, Seif Haridi, Stefan Richter, and Kostas Tzoumas. 2017. State Management in Apache Flink: Consistent Stateful Distributed Stream Processing. PVLDB Endow. (2017).Google Scholar
- Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2013. Integrating Scale out and Fault Tolerance in Stream Processing Using Operator State Management. In ACM SIGMOD.Google Scholar
- Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2014. Making State Explicit for Imperative Big Data Processing. In USENIX ATC.Google Scholar
- Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert Henry, Robert Bradshaw, and Nathan. 2010. FlumeJava: Easy, Efficient Data-Parallel Pipelines. In ACM SIGPLAN.Google Scholar
- Badrish Chandramouli, Guna Prasaad, Donald Kossmann, Justin Levandoski, James Hunter, and Mike Barnett. 2018. FASTER: A Concurrent Key-Value Store with In-Place Updates. SIGMOD.Google ScholarDigital Library
- K. Mani Chandy and Leslie Lamport. 1985. Distributed Snapshots: Determining Global States of Distributed Systems. ACM TOCS (1985).Google ScholarDigital Library
- Confluent. 2017. Running Kafka in Production. https://docs.confluent.io/current/kafka/deployment.htmlGoogle Scholar
- Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon's Highly Available Key-value Store. ACM SIGOPS (2007).Google ScholarDigital Library
- Facebook. 2012. RocksDB.org. Facebook Open Source. https://rocksdb.org/Google Scholar
- Facebook. 2017. RocksDB Tuning Guide. https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-GuideGoogle Scholar
- Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2014. Making state explicit for imperative big data processing. In 2014 USENIX ATC).Google Scholar
- Apache Flink. 2015. Apache Flink Configuration. https://ci.apache.org/projects/flink/flink-docs-master/Google Scholar
- Avrilia Floratou, Ashvin Agrawal, Bill Graham, Sriram Rao, and Karthik Ramasamy. 2017. Dhalion: Self-Regulating Stream Processing in Heron. PVLDB (2017).Google ScholarDigital Library
- Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. 2003. The Google File System. In ACM SOSP.Google Scholar
- Thomas Heinze, Yuanzhen Ji, Lars Roediger, Valerio Pappalardo, Andreas Meister, Zbigniew Jerzak, and Christof Fetzer. 2015. FUGU: Elastic Data Stream Processing with Latency Constraints. IEEE Data Eng. Bull. (2015).Google Scholar
- Moritz Hoffmann, Andrea Lattuada, Frank McSherry, Vasiliki Kalavri, and Timothy Roscoe. 2019 a. Megaphone: Latency-conscious state migration. https://github.com/strymon-system/megaphoneGoogle Scholar
- Moritz Hoffmann, Andrea Lattuada, Frank McSherry, Vasiliki Kalavri, and Timothy Roscoe. 2019 b. Megaphone: Latency-conscious State Migration for Distributed Streaming Dataflows. VLDB (2019).Google Scholar
- Jeong-Hyon Hwang, Magdalena Balazinska, Alexander Rasin, Ugur Cetintemel, Michael Stonebraker, and Stan Zdonik. 2005. High-Availability Algorithms for Distributed Stream Processing. In IEEE ICDE.Google Scholar
- Gabriela Jacques-Silva, Ran Lei, Luwei Cheng, Guoqiang Jerry Chen, Kuen Ching, Tanji Hu, Yuan Mei, Kevin Wilfong, Rithin Shetty, Serhat Yilmaz, Anirban Banerjee, Benjamin Heintz, Shridar Iyer, and Anshul Jaiswal. 2018. Providing Streaming Joins As a Service at Facebook. PVLDB (2018).Google Scholar
- Vasiliki Kalavri, John Liagouris, Moritz Hoffmann, Desislava Dimitrova, Matthew Forshaw, and Timothy Roscoe. 2018. Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows. In USENIX OSDI.Google Scholar
- David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin. 1997. Consistent Hashing and Random Trees.. In ACM STOC.Google Scholar
- Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, and Volker Markl. 2018. Benchmarking Distributed Stream Data Processing Systems. In IEEE ICDE 2018.Google Scholar
- Klaviyo. 2019. Apache Flink Performance Optimization. https://klaviyo.tech/flinkperf-c7bd28acc67Google Scholar
- H. T. Kung, Trevor Blackwell, and Alan Chapman. 1994. Credit-Based Flow Control for ATM Networks: Credit Update Protocol, Adaptive Credit Allocation and Statistical Multiplexing. In ACM SIGCOMM.Google Scholar
- Avinash Lakshman and Prashant Malik. 2010. Cassandra: A Decentralized Structured Storage System. ACM SIGOPS (2010).Google Scholar
- Luo Mai, Kai Zeng, Rahul Potharaju, Le Xu, Shivaram Venkataraman, Paolo Costa, Terry Kim, Saravanan Muthukrishnan, Vamsi Kuppa, Sudheer Dhulipalla, and Sriram Rao. 2018. Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems. VLDB (2018).Google ScholarDigital Library
- Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Mart'in Abadi. 2013. Naiad: A Timely Dataflow System. In ACM SOSP.Google ScholarDigital Library
- Raghunath Nambiar, Meikel Poess, Andrew Masland, H. Reza Taheri, Andrew Bond, Forrest Carman, and Michael Majdalany. 2013. TPC State of the Council 2013. In 5th TPC Technology Conference on Performance Characterization and Benchmarking - Volume 8391.Google Scholar
- Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garc'i a-Soriano, Nicolas Kourtellis, and Marco Serafini. 2015. The power of both choices: Practical load balancing for distributed stream processing engines. In IEEE ICDE.Google Scholar
- Netflix. 2017. Keystone Real-time Stream Processing Platform. https://medium.com/netflix-techblog/keystone-real-time-stream-processing-platform-a3ee651812aGoogle Scholar
- Shadi A. Noghabi, Kartik Paramasivam, Yi Pan, Navina Ramesh, Jon Bringhurst, Indranil Gupta, and Roy H. Campbell. 2017. Samza: Stateful Scalable Stream Processing at LinkedIn. PVLDB (2017).Google Scholar
- Brian M. Oki and Barbara H. Liskov. 1988. Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems. In ACM PODC.Google Scholar
- Konstantin Shvachko, Hairong Kuang, Sanjay Radia, and Robert Chansler. 2010. The hadoop distributed file system. In IEEE MSST.Google Scholar
- Tencent. 2018. Oceanus: A one-stop platform for real time stream processing. https://www.ververica.com/blog/oceanus-platform-powered-by-apache-flinkGoogle Scholar
- Quoc-Cuong To, Juan Soto, and Volker Markl. 2018. A Survey of State Management in Big Data Processing Systems. The VLDB Journal (2018).Google Scholar
- Pete Tucker, Kristin Tufte, Vassilis Papadimos, and David Maier. 2004. NEXMark - A Benchmark for Queries over Data Streams DRAFT. (2004).Google Scholar
- Uber. 2018. Introducing AthenaX, Uber Engineering's Open Source Streaming Analytics Platform. https://eng.uber.com/athenax/Google Scholar
- Robbert van Renesse and Fred B. Schneider. 2004. Chain Replication for Supporting High Throughput and Availability. In USENIX OSDI.Google ScholarDigital Library
- Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, and Carlos Maltzahn. 2006. Ceph: A Scalable, High-performance Distributed File System. In USENIX OSDI.Google Scholar
- Chenggang Wu, Jose Faleiro, Yihan Lin, and Joseph Hellerstein. 2019. Anna: A kvs for any scale. IEEE TKDE (2019).Google Scholar
- Yingjun Wu and Kian-Lee Tan. 2015. ChronoStream: Elastic stateful stream computation in the cloud. In IEEE ICDE.Google Scholar
- Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. 2013. Discretized Streams: Fault-tolerant Streaming Computation at Scale. In ACM SOSP.Google ScholarDigital Library
- Matei Zaharia, Reynold S Xin, Patrick Wendell, Tathagata Das, Michael Armbrust, Ankur Dave, Xiangrui Meng, Josh Rosen, Shivaram Venkataraman, Michael J Franklin, et al. 2016. Apache Spark: A unified engine for big data processing. CACM (2016).Google ScholarDigital Library
- Steffen Zeuch, Ankit Chaudhary, Bonaventura Del Monte, Haralampos Gavriilidis, Dimitrios Giouroukis, Philipp M Grulich, Sebastian Breß, Jonas Traub, and Volker Markl. 2020. The NebulaStream Platform: Data and Application Management for the Internet of Things. In CIDR.Google Scholar
- Yali Zhu, Elke Rundensteiner, and George Heineman. 2004. Dynamic Plan Migration for Continuous Queries over Data Streams. In ACM SIGMOD.Google Scholar
Index Terms
- Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines
Recommendations
Rethinking Stateful Stream Processing with RDMA
SIGMOD '22: Proceedings of the 2022 International Conference on Management of DataRemote Direct Memory Access (RDMA) hardware has bridged the gap between network and main memory speed and thus invalidated the common assumption that network is often the bottleneck in distributed data processing systems. However, high-speed networks do ...
Integrating scale out and fault tolerance in stream processing using operator state management
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of DataAs users of "big data" applications expect fresh results, we witness a new breed of stream processing systems (SPS) that are designed to scale to large numbers of cloud-hosted machines. Such systems face new challenges: (i) to benefit from the "pay-as-...
A new operator for efficient stream-relation join processing in data streaming engines
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge ManagementIn the last decade, Stream Processing Engines (SPEs) have emerged as a new processing paradigm that can process huge amounts of data while retaining low latency and high-throughputs. Yet, it is often necessary to join streaming data with traditional ...
Comments