Abstract
Peer-to-peer (P2P) file sharing accounts for an astonishing volume of current Internet traffic. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a three-tiered approach. First, we analyze a 200-day trace of over 20 terabytes of Kazaa P2P traffic collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about file-sharing behavior observed in the trace. Third, we explore the potential impact of locality-awareness in Kazaa.

Our results reveal dramatic differences between P2P file sharing and Web traffic. For example, we show how the immutability of Kazaa's multimedia objects leads clients to fetch objects at most once; in contrast, a World-Wide Web client may fetch a popular page (e.g., CNN or Google) thousands of times. Moreover, we demonstrate that: (1) this "fetch-at-most-once" behavior causes the Kazaa popularity distribution to deviate substantially from Zipf curves we see for the Web, and (2) this deviation has significant implications for the performance of multimedia file-sharing systems. Unlike the Web, whose workload is driven by document change, we demonstrate that clients' fetch-at-most-once behavior, the creation of new objects, and the addition of new clients to the system are the primary forces that drive multimedia workloads such as Kazaa. We also show that there is substantial untapped locality in the Kazaa workload. Finally, we quantify the potential bandwidth savings that locality-aware P2P file-sharing architectures would achieve.
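The effect of fetch-at-most-once behavior on a popularity curve can be illustrated with a small simulation. The sketch below is not the model developed in the paper; it is a minimal toy under stated assumptions: every client draws objects from the same Zipf distribution (exponent `alpha`, assumed to be 1.0), and the client, object, and request counts are arbitrary illustrative values, not statistics from the trace. The function names are hypothetical.

```python
import random
from collections import Counter
from itertools import accumulate


def zipf_cum_weights(num_objects, alpha=1.0):
    """Cumulative (unnormalized) Zipf weights: weight of rank r is 1 / r**alpha."""
    return list(accumulate(1.0 / rank ** alpha for rank in range(1, num_objects + 1)))


def simulate(num_clients, num_objects, requests_per_client,
             fetch_at_most_once, alpha=1.0, seed=0):
    """Return a Counter mapping object index (0 = most popular) to request count.

    All parameter values are illustrative assumptions, not measurements.
    """
    if fetch_at_most_once and requests_per_client > num_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    cum = zipf_cum_weights(num_objects, alpha)
    objects = range(num_objects)
    counts = Counter()
    for _ in range(num_clients):
        already_fetched = set()
        fetched = 0
        while fetched < requests_per_client:
            obj = rng.choices(objects, cum_weights=cum)[0]
            if fetch_at_most_once and obj in already_fetched:
                continue  # immutable object already held: draw again
            already_fetched.add(obj)
            counts[obj] += 1
            fetched += 1
    return counts


if __name__ == "__main__":
    web_like = simulate(200, 500, 50, fetch_at_most_once=False)
    p2p_like = simulate(200, 500, 50, fetch_at_most_once=True)
    print("rank  repeat-fetch  fetch-at-most-once")
    for rank in (1, 2, 5, 10, 50, 100):
        print(rank, web_like[rank - 1], p2p_like[rank - 1])
```

In the fetch-at-most-once run, no object can receive more requests than there are clients, so the head of the curve is capped while the repeated-fetch run keeps its Zipf-like peak. That cap is one simple way to see why the resulting popularity distribution departs from a pure Zipf curve.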
Measurement, modeling, and analysis of a peer-to-peer file-sharing workload
SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles