ABSTRACT
To serve users quickly, Web service providers build infrastructure closer to clients and use multi-stage transport connections. Although these changes reduce client-perceived round-trip times, TCP's current mechanisms fundamentally limit latency improvements. We performed a measurement study of a large Web service provider and found that, while connections with no loss complete close to the ideal latency of one round-trip time, TCP's timeout-driven recovery causes transfers with loss to take five times longer on average.
In this paper, we present the design of novel loss recovery mechanisms for TCP that judiciously use redundant transmissions to minimize timeout-driven recovery. Proactive, Reactive, and Corrective are three qualitatively different, easily deployable mechanisms that, respectively, (1) proactively recover from losses, (2) recover from them as quickly as possible, and (3) reconstruct packets to mask loss. Crucially, the mechanisms are compatible both with middleboxes and with TCP's existing congestion control and loss recovery. Our large-scale experiments on Google's production network, which serves billions of flows, demonstrate a 23% decrease in mean latency and a 47% decrease in 99th-percentile latency relative to today's TCP.