ABSTRACT
Infrastructure-as-a-Service (IaaS) cloud platforms rent resources, in the form of virtual machines (VMs), under a variety of contract terms that offer different levels of risk and cost. For example, users may acquire VMs in the spot market that are often cheap but entail significant risk, since their price varies over time based on market supply and demand and they may terminate at any time if the price rises too high. Currently, users must manage all the risks associated with using spot servers. As a result, conventional wisdom holds that spot servers are only appropriate for delay-tolerant batch applications. In this paper, we propose a derivative cloud platform, called SpotCheck, that transparently manages the risks associated with using spot servers for users.
SpotCheck provides the illusion of an IaaS platform that offers always-available VMs on demand for a cost near that of spot servers, and supports all types of applications, including interactive ones. SpotCheck's design combines the use of nested VMs with live bounded-time migration and novel server pool management policies to maximize availability, while balancing risk and cost. We implement SpotCheck on Amazon's EC2 and show that it i) provides nested VMs to users that are 99.9989% available, ii) achieves nearly 5x cost savings compared to using equivalent types of on-demand VMs, and iii) eliminates any risk of losing VM state.
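To put the reported figures in concrete terms, the short sketch below converts the 99.9989% availability into expected downtime per year and shows what a 5x saving implies for an hourly price. It is a back-of-the-envelope illustration only: the function names, the 365-day year, and the $1.00 hourly on-demand price are assumptions made for this example and do not come from the paper.

```python
# Back-of-the-envelope arithmetic for the abstract's headline numbers.
# Assumption: a 365-day (non-leap) year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Expected unavailable seconds per year at the given availability."""
    return (1.0 - availability_percent / 100.0) * SECONDS_PER_YEAR


def cost_after_savings(on_demand_cost: float, savings_factor: float) -> float:
    """Cost remaining when the on-demand cost is reduced by savings_factor."""
    return on_demand_cost / savings_factor


if __name__ == "__main__":
    seconds = downtime_seconds_per_year(99.9989)
    print(f"Downtime: {seconds:.1f} s/year ({seconds / 60:.1f} min/year)")
    # Hypothetical on-demand price of $1.00 per hour, chosen for illustration.
    print(f"Hourly cost at 5x savings: ${cost_after_savings(1.00, 5.0):.2f}")
```

Under these assumptions, 99.9989% availability corresponds to roughly 347 seconds (just under six minutes) of unavailability per year, and a 5x saving means paying about one fifth of the on-demand price.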