ABSTRACT
Direct device assignment enhances the performance of guest virtual machines by allowing them to communicate with I/O devices without host involvement. But even with device assignment, guests are still unable to approach bare-metal performance, because the host intercepts all interrupts, including those generated by assigned devices to signal to guests the completion of their I/O requests. The host involvement induces multiple unwarranted guest/host context switches, which significantly hamper the performance of I/O-intensive workloads. To solve this problem, we present ELI (ExitLess Interrupts), a software-only approach for handling interrupts within guest virtual machines directly and securely. By removing the host from the interrupt handling path, ELI manages to improve the throughput and latency of unmodified, untrusted guests by 1.3x-1.6x, allowing them to reach 97%-100% of bare-metal performance even for the most demanding I/O-intensive workloads.
ELI: bare-metal performance for I/O virtualization
ASPLOS '12