DOI: 10.1145/2150976.2150982

Clearing the clouds: a study of emerging scale-out workloads on modern hardware

Published: 03 March 2012

ABSTRACT

Emerging scale-out workloads require extensive amounts of computational resources. However, data centers using modern server hardware face physical constraints in space and power, limiting further expansion and calling for improvements in the computational density per server and in the per-operation energy. Continuing to improve the computational resources of the cloud while staying within physical constraints mandates optimizing server efficiency to ensure that server hardware closely matches the needs of scale-out workloads.

In this work, we introduce CloudSuite, a benchmark suite of emerging scale-out workloads. We use performance counters on modern servers to study scale-out workloads, finding that today's predominant processor micro-architecture is inefficient for running these workloads. We find that inefficiency comes from the mismatch between the workload needs and modern processors, particularly in the organization of instruction and data memory systems and the processor core micro-architecture. Moreover, while today's predominant micro-architecture is inefficient when executing scale-out workloads, we find that continuing the current trends will further exacerbate the inefficiency in the future. In this work, we identify the key micro-architectural needs of scale-out workloads, calling for a change in the trajectory of server processors that would lead to improved computational density and power efficiency in data centers.

    • Published in

      ASPLOS XVII: Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
      March 2012
      476 pages
      ISBN:9781450307598
      DOI:10.1145/2150976
      • ACM SIGARCH Computer Architecture News, Volume 40, Issue 1
        ASPLOS '12
        March 2012
        453 pages
        ISSN:0163-5964
        DOI:10.1145/2189750
      • ACM SIGPLAN Notices, Volume 47, Issue 4
        ASPLOS '12
        April 2012
        453 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/2248487

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate: 535 of 2,713 submissions, 20%