DOI: 10.1145/2882903.2904441
research-article

Realtime Data Processing at Facebook

Published: 14 June 2016

ABSTRACT

Realtime data processing powers many use cases at Facebook, including realtime reporting of the aggregated, anonymized voice of Facebook users, analytics for mobile applications, and insights for Facebook page administrators. Many companies have developed their own systems; we have a realtime data processing ecosystem at Facebook that handles hundreds of gigabytes per second across hundreds of data pipelines.

Many decisions must be made while designing a realtime stream processing system. In this paper, we identify five important design decisions that affect such a system's ease of use, performance, fault tolerance, scalability, and correctness. We compare the alternative choices for each decision and contrast what we built at Facebook to other published systems.

Our main decision was targeting seconds of latency, not milliseconds. Seconds is fast enough for all of the use cases we support and it allows us to use a persistent message bus for data transport. This data transport mechanism then paved the way for fault tolerance, scalability, and multiple options for correctness in our stream processing systems Puma, Swift, and Stylus.
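To make the link between a persistent message bus and fault tolerance concrete, the following is a minimal Python sketch. It is not the implementation of Puma, Swift, or Stylus, and it does not use any real message bus API: `PersistentLog`, `CountingConsumer`, and the parameters `checkpoint_every` and `crash_after` are hypothetical names invented for illustration. The sketch assumes an in-memory append-only list stands in for the persistent bus and that the consumer saves its read offset together with its state.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentLog:
    """Hypothetical stand-in for a persistent message bus: an append-only log."""
    records: list = field(default_factory=list)

    def append(self, record):
        self.records.append(record)

    def read_from(self, offset):
        # Readers pick their own offset, so a restarted reader can replay.
        return list(enumerate(self.records))[offset:]


@dataclass
class CountingConsumer:
    """Counts events per key and checkpoints (offset, counts) together."""
    checkpoint_offset: int = 0
    checkpoint_counts: dict = field(default_factory=dict)

    def run(self, log, checkpoint_every=2, crash_after=None):
        # Always resume from the last checkpoint, never from lost memory.
        offset = self.checkpoint_offset
        counts = dict(self.checkpoint_counts)
        processed = 0
        for position, key in log.read_from(offset):
            if crash_after is not None and processed == crash_after:
                raise RuntimeError("simulated crash")
            counts[key] = counts.get(key, 0) + 1
            processed += 1
            offset = position + 1
            if processed % checkpoint_every == 0:
                # Offset and state are saved as a pair (an assumption here).
                self.checkpoint_offset = offset
                self.checkpoint_counts = dict(counts)
        # Final checkpoint once the available input is exhausted.
        self.checkpoint_offset = offset
        self.checkpoint_counts = dict(counts)
        return counts
```

In this sketch, a consumer that fails partway through loses only the work done since its last checkpoint. Because the log still holds every record, calling `run` again replays from the saved offset and arrives at the same counts as an uninterrupted run. The producer is never involved in recovery, which is one way to read the claim that the transport mechanism paves the way for fault tolerance.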

We then illustrate how our decisions and systems satisfy our requirements for multiple use cases at Facebook. Finally, we reflect on the lessons we learned as we built and operated these systems.


Published in: SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016, 2300 pages
ISBN: 9781450335317
DOI: 10.1145/2882903
Copyright © 2016 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 785 of 4,003 submissions, 20%
