ABSTRACT
Clos-based networks, including Fat-tree and VL2, are being built in data centers, but existing per-flow routing causes low network utilization and a long latency tail. In this paper, by studying the structural properties of Fat-tree and VL2, we propose a per-packet, round-robin routing algorithm called Digit-Reversal Bouncing (DRB). DRB achieves perfect packet interleaving. Our analysis and simulations show that, compared with random load-balancing algorithms, DRB results in smaller, bounded queues even when the traffic load approaches 100%, and it needs a smaller re-sequencing buffer to absorb out-of-order packet arrivals. Our implementation demonstrates that the design can be readily implemented with commodity switches. Experiments on our testbed, a Fat-tree with 54 servers, confirm our analysis and simulations, and further show that our design handles network failures within 1-2 seconds and degrades gracefully in performance.
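The abstract names the two ingredients of DRB, per-packet round-robin and digit reversal, without giving the mechanics. The Python sketch below is one possible reading, not the authors' implementation. It assumes the candidate paths between a server pair are numbered 0 to base^num_digits - 1 and that successive packets visit them in digit-reversed order. The names `digit_reverse`, `digit_reversal_order`, and `RoundRobinSprayer` are hypothetical.

```python
from typing import List


def digit_reverse(index: int, base: int, num_digits: int) -> int:
    """Reverse the base-`base` digits of `index`, zero-padded to `num_digits`."""
    reversed_value = 0
    for _ in range(num_digits):
        # Peel off the least significant digit and append it to the result.
        index, digit = divmod(index, base)
        reversed_value = reversed_value * base + digit
    return reversed_value


def digit_reversal_order(base: int, num_digits: int) -> List[int]:
    """Visiting order of path IDs: the digit reversal of 0, 1, 2, ..."""
    total = base ** num_digits
    return [digit_reverse(i, base, num_digits) for i in range(total)]


class RoundRobinSprayer:
    """Hypothetical per-packet selector that cycles through the paths.

    Assumption: paths are identified by integers 0 .. base**num_digits - 1.
    """

    def __init__(self, base: int, num_digits: int, start: int = 0) -> None:
        self.order = digit_reversal_order(base, num_digits)
        self.position = start % len(self.order)

    def next_path(self) -> int:
        """Return the path ID for the next packet and advance the cycle."""
        path = self.order[self.position]
        self.position = (self.position + 1) % len(self.order)
        return path


if __name__ == "__main__":
    sprayer = RoundRobinSprayer(base=2, num_digits=2)
    print([sprayer.next_path() for _ in range(8)])
```

With `base=2` and `num_digits=2`, consecutive packets alternate between the lower and upper halves of the ID space instead of walking through adjacent IDs. Under the numbering assumed here, that keeps back-to-back packets from sharing the same high-order group of paths.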
Index Terms
- Per-packet load-balanced, low-latency routing for Clos-based data center networks