ABSTRACT
Prior research has shown that single-ISA heterogeneous chip multiprocessors (CMPs) have the potential for greater performance and energy efficiency than homogeneous CMPs. However, restricting the cores to a single ISA removes an important opportunity for greater heterogeneity. To take full advantage of a heterogeneous-ISA CMP, we must be able to migrate execution among heterogeneous cores in order to adapt to program phase changes and changing external conditions (e.g., system power state).
This paper explores migration on heterogeneous-ISA CMPs. This is non-trivial because program state is kept in an architecture-specific form; therefore, state transformation is necessary for migration. To keep migration cost low, the amount of state that requires transformation must be minimized. This work identifies large portions of program state whose form is not critical for performance; the compiler is modified to produce programs that keep most of their state in an architecture-neutral form so that only a small number of data items must be repositioned and no pointers need to be changed. The result is low migration cost with minimal sacrifice of non-migration performance.
Additionally, this work leverages binary translation to enable instantaneous migration. When migration is requested, the program is immediately migrated to a different core where binary translation runs for a short time until a function call is reached, at which point program state is transformed and execution continues natively on the new core.
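The migration sequence described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's implementation: the instruction names, the mode labels (`native-A`, `translated-B`, `native-B`), and the function `run_with_migration` are all hypothetical, and the model treats the function call itself as the first natively executed operation on the new core.

```python
from dataclasses import dataclass


@dataclass
class Step:
    op: str    # hypothetical instruction kind, e.g. "add" or "call"
    mode: str  # "native-A", "translated-B", or "native-B"


def run_with_migration(ops, request_at):
    """Toy model of migration from core A to core B.

    ops        -- list of operation names executed in order
    request_at -- index at which migration is requested (assumption:
                  the request arrives just before that operation)
    Returns the executed steps and the number of state transformations.
    """
    steps = []
    mode = "native-A"
    transforms = 0
    for i, op in enumerate(ops):
        if i == request_at and mode == "native-A":
            # Migrate immediately; core B runs the old ISA's code
            # under binary translation.
            mode = "translated-B"
        if mode == "translated-B" and op == "call":
            # A function call is reached: transform program state once,
            # then continue natively on core B.
            transforms += 1
            mode = "native-B"
        steps.append(Step(op, mode))
    return steps, transforms


if __name__ == "__main__":
    trace = ["add", "load", "call", "mul", "call"]
    for step in run_with_migration(trace, request_at=1)[0]:
        print(step.op, step.mode)
```

In this model the window spent under binary translation is bounded by the distance to the next function call, which is why the translated phase is expected to be short.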
This system can tolerate migrations as often as every 100 ms and still retain 95% of the performance of a system that neither performs nor supports migration.
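As a rough reading of these figures, the following sketch computes the total time budget that a 95% retention rate leaves per migration interval. It assumes, purely for illustration, that all lost performance is counted as time lost per interval; the abstract does not break the loss down into migration cost and steady-state overhead, so the result is only an upper bound on per-migration cost. The function name `overhead_budget_ms` is hypothetical.

```python
def overhead_budget_ms(interval_ms, retained_fraction):
    """Upper bound on time lost per migration interval, in milliseconds.

    Assumption: every bit of lost performance is charged to the interval,
    covering both migration cost and any non-migration slowdown.
    """
    return interval_ms * (1.0 - retained_fraction)


if __name__ == "__main__":
    # Figures from the abstract: one migration per 100 ms, 95% retained.
    print(round(overhead_budget_ms(100, 0.95), 3), "ms per interval")
```

With the abstract's numbers this comes to roughly 5 ms out of every 100 ms.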
Execution migration in a heterogeneous-ISA chip multiprocessor. ASPLOS '12.