
Self-similarity in World Wide Web traffic: evidence and possible causes

Published: 15 May 1996

Abstract

Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to the self-similarity of network traffic. We present a hypothesized explanation for the possible self-similarity of traffic by using a particular subset of wide area traffic: traffic due to the World Wide Web (WWW). Using an extensive set of traces of actual user executions of NCSA Mosaic, reflecting over half a million requests for WWW documents, we examine the dependence structure of WWW traffic. While our measurements are not conclusive, we show evidence that WWW traffic exhibits behavior that is consistent with self-similar traffic models. Then we show that the self-similarity in such traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network. To do this we rely on empirically measured distributions both from our traces and from data independently collected at over thirty WWW sites.
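The mechanism the abstract describes (heavy-tailed transfer durations, heavy-tailed idle or "think" periods, and the superimposition of many sources) can be illustrated with a small simulation. The sketch below is not taken from the paper: it superimposes independent ON/OFF sources whose period lengths are Pareto distributed and then estimates the Hurst parameter H with a variance-time plot, one common estimator of long-range dependence. The function names, the tail indices (1.2 and 1.5), the number of sources, and the block sizes are all illustrative assumptions, and the abstract does not state which estimators the authors used.

```python
import math
import random


def onoff_aggregate(n_sources, n_slots, alpha_on, alpha_off, rng):
    """Superimpose ON/OFF sources with Pareto-distributed period lengths.

    Entry t of the result is the number of sources that are ON in slot t.
    ON periods stand in for transfers, OFF periods for idle or think time
    (an illustrative mapping, not the paper's fitted model).
    """
    counts = [0] * n_slots
    for _ in range(n_sources):
        t = 0
        on = rng.random() < 0.5  # random initial state for each source
        while t < n_slots:
            alpha = alpha_on if on else alpha_off
            # paretovariate(alpha) returns x >= 1 with P[X > x] = x ** -alpha
            duration = math.ceil(rng.paretovariate(alpha))
            end = min(t + duration, n_slots)
            if on:
                for i in range(t, end):
                    counts[i] += 1
            t = end
            on = not on
    return counts


def variance_time_hurst(series, block_sizes):
    """Estimate H from the slope of log Var(block means) against log m.

    For a self-similar process the variance of the m-aggregated series
    decays like m ** -beta, and H = 1 - beta / 2. Independent noise
    gives beta close to 1, hence H close to 0.5.
    """
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[b * m:(b + 1) * m]) / m for b in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((x - mu) ** 2 for x in means) / n_blocks
        xs.append(math.log(m))
        ys.append(math.log(var))
    # Ordinary least-squares slope of the log-log points.
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    sizes = [2 ** k for k in range(8)]  # block sizes 1, 2, 4, ..., 128
    traffic = onoff_aggregate(32, 2 ** 14, 1.2, 1.5, rng)
    noise = [rng.random() for _ in range(2 ** 14)]
    print("ON/OFF aggregate H estimate:", variance_time_hurst(traffic, sizes))
    print("i.i.d. noise H estimate:   ", variance_time_hurst(noise, sizes))
```

Comparing the two estimates shows the qualitative effect: the aggregated ON/OFF series loses variance more slowly under aggregation than independent noise does, which is the signature of long-range dependence. A variance-time estimate from a single finite trace is rough, so the numbers should be read as indicative only.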


        • Published in

          ACM SIGMETRICS Performance Evaluation Review, Volume 24, Issue 1
          May 1996
          273 pages
          ISSN: 0163-5999
          DOI: 10.1145/233008

          SIGMETRICS '96: Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
          May 1996
          279 pages
          ISBN: 0897917936
          DOI: 10.1145/233013

          Copyright © 1996 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States
