Measurement, modeling, and analysis of a peer-to-peer file-sharing workload

Published: 19 October 2003
Abstract

Peer-to-peer (P2P) file sharing accounts for an astonishing volume of current Internet traffic. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a three-tiered approach. First, we analyze a 200-day trace of over 20 terabytes of Kazaa P2P traffic collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about file-sharing behavior observed in the trace. Third, we explore the potential impact of locality-awareness in Kazaa.

Our results reveal dramatic differences between P2P file sharing and Web traffic. For example, we show how the immutability of Kazaa's multimedia objects leads clients to fetch objects at most once; in contrast, a World-Wide Web client may fetch a popular page (e.g., CNN or Google) thousands of times. Moreover, we demonstrate that: (1) this "fetch-at-most-once" behavior causes the Kazaa popularity distribution to deviate substantially from Zipf curves we see for the Web, and (2) this deviation has significant implications for the performance of multimedia file-sharing systems. Unlike the Web, whose workload is driven by document change, we demonstrate that clients' fetch-at-most-once behavior, the creation of new objects, and the addition of new clients to the system are the primary forces that drive multimedia workloads such as Kazaa. We also show that there is substantial untapped locality in the Kazaa workload. Finally, we quantify the potential bandwidth savings that locality-aware P2P file-sharing architectures would achieve.
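The "fetch-at-most-once" effect described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's model and uses none of its trace statistics: the function names, the Zipf exponent of 1.0, and the client, object, and request counts are all assumptions chosen for illustration. Each client draws objects from a Zipf-like popularity distribution; in fetch-at-most-once mode a client redraws whenever it picks an object it already holds, so no object can be requested more often than there are clients, which flattens the head of the observed popularity curve.

```python
import random
from collections import Counter
from itertools import accumulate


def zipf_cum_weights(num_objects, alpha=1.0):
    """Cumulative Zipf-like weights: the object at rank r has weight 1 / r**alpha."""
    return list(accumulate(1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)))


def simulate(num_clients, num_objects, requests_per_client,
             fetch_at_most_once, alpha=1.0, seed=0):
    """Return a Counter mapping object index (0 = most popular) to request count.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if fetch_at_most_once and requests_per_client > num_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    cum = zipf_cum_weights(num_objects, alpha)
    objects = range(num_objects)
    counts = Counter()
    for _ in range(num_clients):
        fetched = set()  # objects this client already holds
        for _ in range(requests_per_client):
            obj = rng.choices(objects, cum_weights=cum, k=1)[0]
            if fetch_at_most_once:
                # Immutable objects are never refetched: redraw until new.
                while obj in fetched:
                    obj = rng.choices(objects, cum_weights=cum, k=1)[0]
                fetched.add(obj)
            counts[obj] += 1
    return counts


if __name__ == "__main__":
    web_like = simulate(50, 200, 40, fetch_at_most_once=False)
    p2p_like = simulate(50, 200, 40, fetch_at_most_once=True)
    print("requests for the most popular object, repeated fetches allowed:",
          web_like.most_common(1)[0][1])
    print("requests for the most popular object, fetch-at-most-once:",
          p2p_like.most_common(1)[0][1])
```

Both runs issue the same total number of requests; only the per-client constraint differs. In the fetch-at-most-once run, the requests that would have gone to the most popular objects shift to less popular ones.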

Published in

ACM SIGOPS Operating Systems Review, Volume 37, Issue 5 (SOSP '03)
December 2003, 329 pages
ISSN: 0163-5980
DOI: 10.1145/1165389

SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles
October 2003, 338 pages
ISBN: 1581137575
DOI: 10.1145/945445

Copyright © 2003 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States
