ABSTRACT
This paper looks at optimising the energy cost of storing user-generated content when accesses are highly skewed towards a few "popular" items whose popularity ranks vary dynamically. Using traces from a video-sharing website and a social news website, it is shown that non-popular content, which constitutes the majority of items, tends to have accesses that spread locally in the social network, in a viral fashion. Based on the proportion of viral accesses, popular data is separated onto a few disks. These popular disks receive the majority of accesses, allowing the other disks to be spun down when they have no requests, saving energy.
Our technique, SpinThrift, improves upon Popular Data Concentration (PDC). Whereas SpinThrift makes a binary separation between popular and unpopular items, PDC directs the majority of accesses to a few disks by arranging data according to popularity rank. Disregarding the energy required for data reorganisation, SpinThrift and PDC display similar energy savings. However, because popularity ranks change dynamically, SpinThrift requires less than half as many data reorderings as PDC.
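The difference between a binary popular/unpopular split and a rank-ordered layout can be illustrated with a minimal sketch. This is not the paper's implementation: the `Item` class, the `viral_threshold` value of 0.5, the rule that an item counts as popular when its viral fraction falls below that threshold, and the sample access counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    viral_accesses: int      # accesses that spread via social-network neighbours
    nonviral_accesses: int   # all other accesses

    @property
    def total(self) -> int:
        return self.viral_accesses + self.nonviral_accesses

    @property
    def viral_fraction(self) -> float:
        # Assumption: an item with no accesses is treated as fully viral,
        # so it lands in the unpopular set.
        return self.viral_accesses / self.total if self.total else 1.0


def binary_split(items, viral_threshold=0.5):
    """Binary separation: popular items are those with mostly non-viral accesses.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    popular = {i.name for i in items if i.viral_fraction < viral_threshold}
    unpopular = {i.name for i in items if i.viral_fraction >= viral_threshold}
    return popular, unpopular


def rank_order(items):
    """Rank-based layout in the spirit of PDC: most-accessed items first."""
    return [i.name for i in sorted(items, key=lambda i: i.total, reverse=True)]


# Two made-up measurement epochs: ranks shift, but class membership does not.
epoch1 = [Item("a", 5, 95), Item("b", 10, 70), Item("c", 8, 2), Item("d", 4, 1)]
epoch2 = [Item("a", 5, 65), Item("b", 10, 110), Item("c", 3, 1), Item("d", 6, 2)]

if __name__ == "__main__":
    print("rank layout changed:  ", rank_order(epoch1) != rank_order(epoch2))
    print("binary layout changed:", binary_split(epoch1) != binary_split(epoch2))
```

In this toy data the rank order flips between epochs, so a rank-based layout would need reorganising, while the popular and unpopular sets stay the same and the binary layout needs no data movement.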