DOI: 10.1145/1669112.1669165
research-article

Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems

Published: 12 December 2009

ABSTRACT

Across a broad range of applications, multicore technology is the most important factor that drives today's microprocessor performance improvements. Closely coupled is a growing complexity of the memory subsystems with several cache levels that need to be exploited efficiently to gain optimal application performance. Many important implementation details of these memory subsystems are undocumented. We therefore present a set of sophisticated benchmarks for latency and bandwidth measurements to arbitrary locations in the memory subsystem. We consider the coherency state of cache lines to analyze the cache coherency protocols and their performance impact. The potential of our approach is demonstrated with an in-depth comparison of ccNUMA multiprocessor systems with AMD (Shanghai) and Intel (Nehalem-EP) quad-core x86-64 processors that both feature integrated memory controllers and coherent point-to-point interconnects. Using our benchmarks we present fundamental memory performance data and architectural properties of both processors. Our comparison reveals in detail how the microarchitectural differences tremendously affect the performance of the memory subsystem.
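
The benchmarks themselves are not reproduced on this page. As a rough illustration of one common way latency measurements are structured, the sketch below builds a pointer-chasing chain, where every access depends on the result of the previous one, so the accesses cannot overlap. This is not the authors' benchmark: the function names (`build_chase_chain`, `chase`, `time_per_access_ns`) are hypothetical, the single-cycle shuffle (Sattolo's algorithm) is a choice made here, and Python interpreter overhead dominates the timing, so the numbers it produces do not reflect hardware cache latency. A real measurement would use native code, pin threads to cores, and place cache lines in a chosen coherency state before timing.

```python
import random
import time


def build_chase_chain(n, seed=0):
    """Return a list nxt where following i -> nxt[i] visits all n slots in one cycle."""
    rng = random.Random(seed)
    nxt = list(range(n))
    # Sattolo's algorithm: produces a permutation consisting of exactly one cycle,
    # so the chase cannot get stuck in a short loop that stays in a small cache.
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain for `steps` dependent accesses and return the final index."""
    idx = start
    for _ in range(steps):
        idx = nxt[idx]  # each load depends on the previous one
    return idx


def time_per_access_ns(nxt, steps):
    """Average wall-clock time per dependent access (includes interpreter overhead)."""
    chase(nxt, len(nxt))  # warm-up pass touches every slot once
    t0 = time.perf_counter_ns()
    chase(nxt, steps)
    t1 = time.perf_counter_ns()
    return (t1 - t0) / steps


if __name__ == "__main__":
    chain = build_chase_chain(1 << 16)
    print(f"{time_per_access_ns(chain, 1_000_000):.1f} ns per access")
```

Varying the chain length changes the working-set size, which is how a measurement of this shape is usually swept across cache levels.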

        • Published in

          MICRO 42: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
          December 2009
          601 pages
          ISBN: 9781605587981
          DOI: 10.1145/1669112

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 484 of 2,242 submissions, 22%
