DOI: 10.1145/2168836.2168870
research-article

Improving network connection locality on multicore systems

Published: 10 April 2012

ABSTRACT

Incoming and outgoing processing for a given TCP connection often execute on different cores: an incoming packet is typically processed on the core that receives the interrupt, while outgoing data processing occurs on the core running the relevant user code. As a result, accesses to read/write connection state (such as TCP control blocks) often involve cache invalidations and data movement between cores' caches. These can take hundreds of processor cycles, enough to significantly reduce performance.

We present a new design, called Affinity-Accept, that causes all processing for a given TCP connection to occur on the same core. Affinity-Accept arranges for the network interface to determine the core on which application processing for each new connection occurs, in a lightweight way; it adjusts the card's choices only in response to imbalances in CPU scheduling. Measurements show that for the Apache web server serving static files on a 48-core AMD system, Affinity-Accept reduces time spent in the TCP stack by 30% and improves overall throughput by 24%.
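
The steering idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's implementation: the `Steering` class, the flow-group table, the per-core accept queues, the CRC32 hash, and the queue-length imbalance test are all hypothetical names and simplifications introduced here. It shows a connection being mapped to a core by a table lookup (standing in for the network card's choice), accepted on that same core, and the table being adjusted only when the cores look imbalanced.

```python
import zlib
from collections import deque

NUM_CORES = 4          # hypothetical core count for the example
NUM_FLOW_GROUPS = 16   # hypothetical size of the steering table


class Steering:
    """Toy model: a steering table picks the core for each connection."""

    def __init__(self, num_cores=NUM_CORES, num_groups=NUM_FLOW_GROUPS):
        # flow group -> core, initially spread round-robin over the cores
        self.table = [g % num_cores for g in range(num_groups)]
        # one accept queue per core, so accepts stay local to that core
        self.accept_queues = [deque() for _ in range(num_cores)]

    def flow_group(self, conn):
        # conn is (src_ip, src_port, dst_ip, dst_port); CRC32 is an
        # assumption standing in for whatever hash the card applies
        key = "{}:{}-{}:{}".format(*conn).encode()
        return zlib.crc32(key) % len(self.table)

    def core_for(self, conn):
        return self.table[self.flow_group(conn)]

    def on_new_connection(self, conn):
        # Queue the connection on the core that receives its packets.
        core = self.core_for(conn)
        self.accept_queues[core].append(conn)
        return core

    def accept(self, core):
        # Application code on `core` only takes connections steered to it.
        queue = self.accept_queues[core]
        return queue.popleft() if queue else None

    def rebalance(self, threshold=2):
        # Assumed imbalance signal: difference in accept-queue length.
        lengths = [len(q) for q in self.accept_queues]
        busiest = max(range(len(lengths)), key=lengths.__getitem__)
        idlest = min(range(len(lengths)), key=lengths.__getitem__)
        if lengths[busiest] - lengths[idlest] < threshold:
            return False  # balanced enough: leave the table alone
        for group, core in enumerate(self.table):
            if core == busiest:
                self.table[group] = idlest  # move one flow group
                return True
        return False


if __name__ == "__main__":
    steering = Steering()
    conn = ("10.0.0.1", 40000, "10.0.0.2", 80)
    core = steering.on_new_connection(conn)
    print("connection steered to core", core)
    print("accepted on same core:", steering.accept(core) == conn)
```

Moving a whole flow group, rather than individual connections, keeps the adjustment cheap in this model: one table entry changes, and connections that hash to that group afterwards are steered to the new core.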

          • Published in

            EuroSys '12: Proceedings of the 7th ACM european conference on Computer Systems
            April 2012
            394 pages
            ISBN: 9781450312233
            DOI: 10.1145/2168836

            Copyright © 2012 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            Overall Acceptance Rate: 241 of 1,308 submissions, 18%
