ABSTRACT
Tasks in modern data-parallel clusters have highly diverse resource requirements across CPU, memory, disk and network. Any of these resources may become a bottleneck, and hence the likelihood of wasting resources due to fragmentation is now larger. Today's schedulers do not explicitly reduce fragmentation. Worse, since they allocate only cores and memory, the resources that they ignore (disk and network) can be over-allocated, leading to interference, failures and hogging of cores or memory that could have been used by other tasks. We present Tetris, a cluster scheduler that packs tasks, i.e., matches multi-resource task requirements with the resource availabilities of machines, so as to increase cluster efficiency (reduce makespan). Further, Tetris uses an analog of shortest-running-time-first to trade off cluster efficiency for speeding up individual jobs. Tetris' packing heuristics work seamlessly alongside a large class of fairness policies. Trace-driven simulations and deployment of our prototype on a 250-node cluster show median gains of 30% in job completion time while achieving nearly perfect fairness.
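To make the idea of packing concrete, the sketch below shows one way a scheduler might match multi-resource task demands against a machine's free resources. It is only an illustration: the abstract does not specify the scoring function, so the dot-product alignment score, the `eps` weight on remaining work (a stand-in for the shortest-running-time-first analog), and all names (`fits`, `alignment`, `pick_task`) are assumptions rather than the authors' implementation. Demands and availabilities are assumed to be normalized to a machine's capacity.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: resource name -> fraction of one machine's capacity.
Resources = Dict[str, float]
# Hypothetical task tuple: (name, demand vector, remaining work of its job).
Task = Tuple[str, Resources, float]


def fits(demand: Resources, free: Resources) -> bool:
    """A task fits only if every demanded resource is available,
    including disk and network, so no dimension is over-allocated."""
    return all(demand[r] <= free.get(r, 0.0) for r in demand)


def alignment(demand: Resources, free: Resources) -> float:
    """Assumed packing score: dot product of demand and free capacity.
    Larger values mean the task uses more of what the machine has spare."""
    return sum(demand[r] * free.get(r, 0.0) for r in demand)


def pick_task(free: Resources, tasks: List[Task], eps: float = 0.0) -> Optional[str]:
    """Return the name of the fitting task with the best combined score,
    or None if nothing fits. eps > 0 penalizes tasks from jobs with more
    remaining work, favoring jobs that are closer to finishing."""
    best: Optional[Tuple[float, str]] = None
    for name, demand, remaining_work in tasks:
        if not fits(demand, free):
            continue
        score = alignment(demand, free) - eps * remaining_work
        if best is None or score > best[0]:
            best = (score, name)
    return None if best is None else best[1]


# Illustrative data (made up for this sketch).
free = {"cpu": 0.5, "mem": 0.5, "disk": 1.0, "net": 1.0}
tasks = [
    ("cpu_heavy", {"cpu": 0.75, "mem": 0.25, "disk": 0.0, "net": 0.0}, 10.0),
    ("io_heavy", {"cpu": 0.25, "mem": 0.25, "disk": 0.5, "net": 0.5}, 40.0),
    ("small", {"cpu": 0.25, "mem": 0.25, "disk": 0.0, "net": 0.0}, 1.0),
]

print(pick_task(free, tasks))            # pure packing
print(pick_task(free, tasks, eps=0.05))  # packing traded off against remaining work
```

With `eps=0`, the choice is driven purely by packing, so the task that best uses the machine's idle disk and network is preferred. Raising `eps` shifts the choice toward tasks from jobs with little remaining work, which illustrates the trade-off between cluster efficiency and individual job speed that the abstract describes.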
Multi-resource packing for cluster schedulers. In SIGCOMM '14.