ABSTRACT
Two overriding concerns in the development of embedded MPSoCs are ease of programming and hardware complexity. In this paper, we present SoC-TM, an integrated HW/SW solution for transactional programming on embedded MPSoCs. Our proposal leverages a Hardware Transactional Memory (HTM) design based on a dedicated HW module for conflict management, whose functionality is exposed to software through compiler directives implemented as an extension to the popular OpenMP programming model. To further improve ease of programming, our framework supports speculative parallelism by enforcing a given commit order in hardware. Our experimental results confirm that SoC-TM is a viable and cost-effective solution for embedded MPSoCs in terms of energy, performance, and productivity.
Index Terms
- SoC-TM: integrated HW/SW support for transactional memory programming on embedded MPSoCs