Abstract
Over the last few years, the embedded systems industry has undergone a revolution thanks to the introduction of multicore and heterogeneous devices. These new platforms open new paths for embedded devices, which can now be used for more demanding tasks by exploiting the parallelism offered by multicore processors. Nonetheless, progress in hardware technology has not been matched by improvements on the software side, and runtime mechanisms for managing resource allocation and resource contention are still not effective enough. This paper tackles the problem of dynamic resource management from the application's point of view and presents a user-space library for controlling application performance. The control knob exploited by the library is the number of threads used by an application, and the library integrates seamlessly with OpenMP. A case study illustrates the benefits of the library in a classic embedded system scenario, where it introduces an overhead of less than 0.5%.
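To make the thread-count knob concrete, the following is a minimal sketch in C and not the library's actual API or control policy. The function names `next_thread_count` and `apply_thread_count`, the throughput-based goal, and the assumption that throughput scales roughly linearly with the number of threads are all hypothetical, introduced only for illustration. The only real API call used is OpenMP's `omp_set_num_threads`.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/*
 * Hypothetical controller step (an assumption, not the paper's policy):
 * given the thread count used in the last iteration and the throughput
 * measured with it, estimate the thread count needed to reach the target.
 * Assumes throughput grows roughly linearly with the number of threads.
 */
int next_thread_count(int current, double measured, double target,
                      int max_threads)
{
    if (current < 1)
        current = 1;
    if (measured <= 0.0)
        return current;            /* no usable measurement yet */

    double ideal = current * (target / measured);

    /* Round up so the target is met rather than narrowly missed. */
    int next = (int)ideal;
    if ((double)next < ideal)
        next++;

    /* Clamp to the range the platform can actually provide. */
    if (next < 1)
        next = 1;
    if (next > max_threads)
        next = max_threads;
    return next;
}

/*
 * Apply the decision to subsequent OpenMP parallel regions.
 * Without OpenMP support the call is a no-op, so the sketch still builds.
 */
void apply_thread_count(int n)
{
#ifdef _OPENMP
    omp_set_num_threads(n);
#else
    (void)n;
#endif
}
```

In an application, a step like this would run between iterations of the main loop: measure the throughput of the last iteration, call `next_thread_count`, and pass the result to `apply_thread_count` before the next `#pragma omp parallel` region. A real controller would also need to handle non-linear scaling and contention from other applications, which this sketch ignores.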