DOI: 10.1145/2619239.2626334
Research article, free access

Multi-resource packing for cluster schedulers

Published: 17 August 2014

ABSTRACT

Tasks in modern data-parallel clusters have highly diverse resource requirements along CPU, memory, disk, and network. Any of these resources may become a bottleneck, so the likelihood of wasting resources due to fragmentation is now larger. Today's schedulers do not explicitly reduce fragmentation. Worse, since they allocate only cores and memory, the resources they ignore (disk and network) can be over-allocated, leading to interference, failures, and hogging of cores or memory that could have been used by other tasks. We present Tetris, a cluster scheduler that packs, i.e., matches multi-resource task requirements with the resource availabilities of machines, so as to increase cluster efficiency (makespan). Further, Tetris uses an analog of shortest-running-time-first to trade off cluster efficiency for speeding up individual jobs. Tetris' packing heuristics seamlessly work alongside a large class of fairness policies. Trace-driven simulations and deployment of our prototype on a 250-node cluster show median gains of 30% in job completion time while achieving nearly perfect fairness.
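The core packing idea in the abstract — matching a task's multi-resource demand vector against a machine's free-resource vector — can be illustrated with a small sketch. This is not the paper's implementation; the names, resource order, and the dot-product scoring rule below are illustrative assumptions about one way such an alignment heuristic could work.

```python
# Illustrative sketch of a multi-resource packing step (names and the
# exact scoring rule are assumptions, not the paper's code). Tasks and
# machines are vectors over resources: (cores, memory, disk, network).

def fits(demand, free):
    """A task fits on a machine if no resource demand exceeds availability."""
    return all(d <= f for d, f in zip(demand, free))

def alignment(demand, free):
    """Score a task by the dot product of its demand and the machine's
    free resources; high scores favor tasks that use what is abundant,
    which tends to reduce fragmentation."""
    return sum(d * f for d, f in zip(demand, free))

def pick_task(pending, free):
    """Among pending tasks that fit, pick the one with the best alignment."""
    feasible = [t for t in pending if fits(t["demand"], free)]
    if not feasible:
        return None
    return max(feasible, key=lambda t: alignment(t["demand"], free))

# One machine with 4 cores, 16 GB RAM, 100 MB/s disk, 50 MB/s network.
free = (4, 16, 100, 50)
pending = [
    {"id": "a", "demand": (2, 8, 10, 5)},   # CPU/memory heavy
    {"id": "b", "demand": (1, 2, 90, 5)},   # disk heavy
    {"id": "c", "demand": (8, 4, 10, 5)},   # needs 8 cores: does not fit
]
best = pick_task(pending, free)  # task "b" aligns best with abundant disk
```

The abstract also mentions trading off this packing score against a shortest-running-time-first analog to speed up individual jobs; one natural (assumed) way to do that is to rank tasks by a weighted sum of the alignment score and the job's remaining work.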


Published in
SIGCOMM '14: Proceedings of the 2014 ACM Conference on SIGCOMM
August 2014, 662 pages
ISBN: 9781450328364
DOI: 10.1145/2619239

          Copyright © 2014 ACM


Publisher
Association for Computing Machinery, New York, NY, United States




Acceptance Rates
SIGCOMM '14 paper acceptance rate: 45 of 242 submissions (19%). Overall acceptance rate: 554 of 3,547 submissions (16%).
