ABSTRACT
Double-paging is an often-cited, if unsubstantiated, problem in multi-level scheduling of memory between virtual machines (VMs) and the hypervisor. This problem occurs when both a virtualized guest and the hypervisor overcommit their respective physical address spaces. When the guest pages out memory previously swapped out by the hypervisor, it initiates an expensive sequence of steps: the contents are read back in from the hypervisor swapfile only to be written out again, significantly lengthening the time to complete the guest I/O request. As a result, performance degrades rapidly.
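The double-paging sequence can be illustrated with a minimal model (all names here are hypothetical; this is an editor's sketch, not the paper's implementation): a guest pageout of a page the hypervisor has already swapped costs an extra disk read before the guest's own write can proceed.

```python
# Illustrative model of the double-paging sequence. Names and structures
# are hypothetical; they model the I/O cost, not any real hypervisor.

def guest_pageout(page, hypervisor_swapped, io_log):
    """Guest evicts `page` to its own swap. If the hypervisor has already
    swapped out the backing machine page, its contents must first be read
    back in from the hypervisor swapfile -- the double-paging penalty."""
    if page in hypervisor_swapped:
        io_log.append(("read", "hypervisor-swapfile", page))  # extra I/O
        hypervisor_swapped.discard(page)
    io_log.append(("write", "guest-swapfile", page))          # guest's own I/O

io_log = []
swapped = {"p1"}                       # hypervisor already swapped out p1
guest_pageout("p1", swapped, io_log)   # costs two disk I/Os (read + write)
guest_pageout("p2", swapped, io_log)   # costs one disk I/O (write only)
print(io_log)
```

In this model the doubly-paged page `p1` generates twice the disk traffic of the ordinary pageout of `p2`, which is the overhead Tesseract targets.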
We present Tesseract, a system that directly and transparently addresses the double-paging problem. Tesseract tracks when guest and hypervisor I/O operations are redundant and modifies these I/Os to create indirections to existing disk blocks containing the page contents. Although our focus is on reconciling I/Os between the guest disks and hypervisor swap, our technique is general and can reconcile, or deduplicate, I/Os for guest pages read or written by the VM.
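The indirection idea can be sketched as a mapping from guest disk blocks to the blocks that already hold their contents (a hypothetical `IndirectionTable`; the sketch is the editor's illustration of the technique, not Tesseract's actual data structure): a redundant guest write is replaced by a table update, so no data is copied.

```python
# Hedged sketch of I/O reconciliation via block indirection. Instead of
# reading page contents from the hypervisor swapfile only to rewrite them
# to the guest disk, the guest write becomes a mapping to the block that
# already holds the data. All names are illustrative.

class IndirectionTable:
    def __init__(self):
        self.map = {}  # guest (disk, block) -> (backing store, block)

    def redirect(self, guest_block, swap_block):
        """Record that `guest_block`'s contents live at `swap_block`;
        no disk read or write is performed."""
        self.map[guest_block] = swap_block

    def resolve(self, guest_block):
        """Return where a guest read should actually be directed."""
        return self.map.get(guest_block, guest_block)

table = IndirectionTable()
table.redirect(("guest-disk", 42), ("hyp-swap", 7))
print(table.resolve(("guest-disk", 42)))  # read served from hypervisor swap
print(table.resolve(("guest-disk", 1)))   # unmapped block: read as usual
```

A guest read of block 42 is transparently redirected to the hypervisor swap block, while unmapped blocks resolve to themselves and take the normal I/O path.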
Deduplication of disk blocks for file contents accessed in a common manner is well understood. One challenge our approach faces is that the locality of guest I/Os (reflecting the guest's notion of disk layout) often differs from that of the blocks in the hypervisor swap. This loss of locality through indirection causes significant performance loss on subsequent guest reads. We propose two alternatives for recovering this lost locality, each based on the idea of asynchronously reorganizing the indirected blocks in persistent storage.
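The asynchronous reorganization can be sketched as a background pass that materializes each indirected block at its guest location and then drops the indirection, so later sequential guest reads regain their locality (the function and toy block store below are the editor's hypothetical illustration, not either of the paper's two mechanisms):

```python
# Hedged sketch: background reorganization of indirected blocks. Copies
# run off the guest I/O path; afterwards reads hit the guest's own layout.
# All names and the toy in-memory "store" are illustrative.

def reorganize(indirections, read_block, write_block):
    """Copy each indirected block to its guest location, then remove
    the indirection so future reads need no redirection."""
    for guest_block, backing_block in list(indirections.items()):
        write_block(guest_block, read_block(backing_block))
        del indirections[guest_block]

# Toy backing store standing in for persistent storage.
store = {("hyp-swap", 7): b"contents"}
indirections = {("guest-disk", 42): ("hyp-swap", 7)}

reorganize(indirections,
           read_block=store.__getitem__,
           write_block=store.__setitem__)

print(store[("guest-disk", 42)])  # contents now at the guest's own block
print(indirections)               # empty: all indirections resolved
```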
We evaluate our system and show that it can significantly reduce the costs of double-paging. We focus our experiments on a synthetic benchmark designed to highlight its effects. In our experiments we observe that Tesseract can improve our benchmark's throughput by as much as 200% when using traditional disks and by as much as 30% when using an SSD. At the same time, worst-case application responsiveness can be improved by a factor of 5.
Tesseract: reconciling guest I/O and hypervisor swapping in a VM (VEE '14)