DOI: 10.1145/3127479.3128601
research-article
Public Access
Best Paper

Occupy the cloud: distributed computing for the 99%

Published: 24 September 2017

ABSTRACT

Distributed computing remains inaccessible to a large number of users, in spite of many open source platforms and extensive commercial offerings. While distributed computation frameworks have moved beyond a simple map-reduce model, many users still struggle with complex cluster management and configuration tools, even when running simple embarrassingly parallel jobs. We argue that stateless functions represent a viable platform for these users, eliminating cluster management overhead and fulfilling the promise of elasticity. Furthermore, using our prototype implementation, PyWren, we show that this model is general enough to efficiently implement a number of distributed computing models, such as bulk synchronous parallel (BSP). Extrapolating from recent trends in network bandwidth and the advent of disaggregated storage, we suggest that stateless functions are a natural fit for data processing in future computing environments.
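
As a rough illustration of the programming model the abstract describes, the following minimal sketch runs a stateless function over independent inputs and then combines the results in a second, BSP-style step. It is not PyWren's implementation: a local thread pool from Python's standard library stands in for remotely executed cloud functions, and the names `stateless_map`, `partial_sum_of_squares`, and `bsp_sum_of_squares` are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def stateless_map(func, inputs, max_workers=4):
    """Apply func to every input independently and return results in input order.

    Assumption: a local thread pool stands in for remote stateless functions.
    Each call sees only its own argument and shares no state with other calls.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() blocks until every task has finished, which acts as a barrier.
        return list(pool.map(func, inputs))


def partial_sum_of_squares(partition):
    """Stateless task: depends only on the partition it is handed."""
    return sum(v * v for v in partition)


def bsp_sum_of_squares(values, num_partitions=3):
    """Two BSP-style supersteps: parallel partial sums, then a final combine."""
    # Superstep 1: split the input and process each partition independently.
    partitions = [values[i::num_partitions] for i in range(num_partitions)]
    partials = stateless_map(partial_sum_of_squares, partitions)
    # Barrier: stateless_map returns only after all partitions are done.
    # Superstep 2: combine the partial results.
    return sum(partials)


if __name__ == "__main__":
    print(stateless_map(lambda x: x * x, [1, 2, 3]))
    print(bsp_sum_of_squares(list(range(10))))
```

The caller never provisions or configures workers; it only supplies a function and its inputs, which is the property the abstract argues makes this model accessible.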

      • Published in

        SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
        September 2017
        672 pages
        ISBN:9781450350280
        DOI:10.1145/3127479

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 169 of 722 submissions, 23%
