Abstract
The complexity and the diversity of computer architectures have dramaticaly evolved over the last decade, which makes it impossible to manually optimize codes for all these architectures. In addition, compilers must remain conservative with respect to their optimization choices because of their static cost model. One way to guide them is to use feedback data from data profiling of a representative training dataset (FDO/PGO) for a given application. It then becomes possible, based on that knowledge, to add specific compiler directives and/or flags to enhance performance. Moreover, automatic transformations simplifying portions of the application (e.g. specialization) can be applied. In this paper we present ASSIST, a directive-oriented source-to-source manipulation tool that aims at providing such assistance. The tool is integrated into the MAQAO toolset and takes advantage of all the available static and dynamic profiling data produced by the other tools. It also features a set of code transformations triggered by directives. The combination of both leads to an autotuning process that helps users to keep their code as generic as possible whilst also benefiting from a performance gain related to feedback or user knowledge. We demonstrate how we can build a compiler’s PGO-like tool and compare our first results to the Intel compiler PGO mode.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Amaral, J.N., Berube, P.: Aestimo: a Feedback-Directed Optimization Evaluation Tool. IEEE, Piscataway, NJ, USA (2006)
Barthou, D., Rubial, A.C., Jalby, W., Koliai, S., Valensi, C.: Performance tuning of x86 openmp codes with maqao. In: Parallel Tools Workshop, pp. 95–113. Desden, Germany, September 2009. Springer
Bendifallah, Z., Jalby, W., Noudohouenou, J., Oseret, E., Palomares, V., Rubial, A.C.: PAMDA: performance assessment using MAQAO toolset and differential analysis, pp. 107–127. Springer International Publishing, Cham (2014)
Bodin, F., Dolbeau, R., Bihan, S.: Hmpp: a hybrid multi-core parallel programming environment. In: Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), vol. 28 (2007)
Bondhugula, U., Hartono, A., Ramanujam, J., Sadayappan, P.: A practical automatic polyhedral parallelizer and locality optimizer. In: ACM SIGPLAN Notices, pp. 101–113. ACM (2008)
Charif Rubial, A.S., Lereste, J.-B.: https://www.maqao.org/release/MAQAO.Tutorial.LProf.v1.pdf
Charif-Rubial, A.S., Barthou, D., Valensi, C., Shende, S., Malony, A., Jalby, W.: Mil: a language to build program analysis tools through static binary instrumentation. In: 20th Annual International Conference on High Performance Computing, pp. 206–215, Dec 2013
Chen, D., Xinliang Li, D., Moseley, T.: Autofdo: automatic feedback-directed optimization for warehouse-scale applications. In: Proceedings of the 2016 International Symposium on Code Generation and Optimization, CGO 2016, pp. 12–23. ACM, New York, NY, USA (2016)
Chris Lattner et Vikram Adve. Dms/spl reg: program transformations for practical scalable software evolution. In: Proceedings of the 26th International Conference on Software Engineering, ICSE 2004, pp. 625–634. IEEE (2004)
Chris Lattner et Vikram Adve. Llvm a compilation framework for lifelong program analysis and transformation. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization, p. 75. IEEE Computer (2004)
Chun Chen, J.C., Hall, M.: Chill: a framework for composing high-level loop transformations, June 2008
Cordy, J.R.: Source transformation, analysis and generation in txl. In: Proceedings of the 2006 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, PEPM 2006, pp. 1–11. ACM, New York, NY, USA (2006)
Dave et al.: Cetus: a source-to-source compiler infrastructure for multicores. Computer, 36—42, December 2009
Gonze, X. et al.: Abinit: first-principles approach to material and nanosystem properties. Comput. Phys. Commun., 2582–2615. Elsevier (2009)
Hartono, A., Norris, B., Sadayappan, P.: Annotation-based empirical performance tuning using orio. In: 2009 IEEE International Symposium on Parallel Distributed Processing, pp. 1–11, May 2009
Irigoin et al: Interprocedural analyses forprogramming environments. In: Workshop on Evironments and Tools For Parallel Scientifc Computing, Saint-Hilaire du Touvier, France, August 1992
Koliaï, S., Bendifallah, Z., Tribalat, M., Valensi, C., Acquaviva, J.-T., Jalby, W.: Quantifying performance bottleneck cost through differential analysis. In: Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, ICS 2013, pp. 263–272. ACM, New York, NY, USA, (2013)
MAQAO toolsuite. http://www.maqao.org
Novillo, D.: Samplepgo: the power of profile guided optimizations without the usability burden. In: Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, LLVM-HPC 2014, pp. 22–28. IEEE Press, Piscataway, NJ, USA (2014)
Palkowski, M., Bielecki, W.: TRACO Parallelizing Compiler, pp. 409–421. Springer International Publishing, Cham (2015)
Paraformance. http://paraformance.weebly.com/
Paul Klint, J.V., van der Storm, T.: Rascal a domain specific language for source code analysis ad manipulation. In: IEEE International Working Conference on Source Code Analysis and Manipulation, pp. 168–177. IEEE Computer Society (2009)
Quinlan et al.: Rose: compiler support for object-oriented framework. In: Parallel Processing Letters, pp. 215—226. Lawrence Livermore National Laboratory, Livermore, CA, USA, October 2000. World Scientific
Rubial, A.C., Oseret, E., Noudohouenou, J., Jalby, W., Lartigue, G.: CQA: a code quality analyzer tool at binary level. In: HiPC, pp. 1–10. IEEE Computer Society (2014)
Rudgyard, M., Schonfeld, T.: Steady and unsteady flow simulationsusing the hybrid flow solver avbp. AIAA J., 1378–1385. AIAA ARC (1999)
Takizawa, H., Suda, R., Hirasawa, S.: Xevtgen: fortran code transformer generator for high performance scienti c codes. Int. J. Network. Comput., 263—289 (2016)
Verdoolaege, S., et al.: Polyhedral parallel code generation for cuda. ACM Trans. Architec. Code Optim. ACM, January 2013
Vermaas, R., Bravenboer, M., Kalleberg, K.T., Visser, E.: Stratego/xt 0.17. a language and toolset for program transformation. In: Science of Computer Programming. Elsevier, May 2008
Wu, C., Lian, R., Zhang, J., Ju, R., Chan, S., Liu, L., Feng, X., Zhang, Z.: An Overview of the Open Research Compiler, pp. 17–31. Springer, Berlin Heidelberg, Berlin, Heidelberg (2005)
Xiao, X., Hirasawa, S., Takizawa, H., Kobayashi, H.: An approach to customization of compiler directives for application-specific code transformations. In: 2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs, pp. 99–106, Sept 2014
Yi, Q.: Poet: a scripting language for applying parameterized source-to-source program transformations. In: Software Practice And Experience, pp. 675–706. University of Texas at San Antonio, USA, May 2012. John Wiley and Sons
Acknowledgements
We would like to thank Gabriel Staffelbach (CERFACS) for providing our laboratory with the AVBP application, as well as Ghislain Lartigue and Vincent Moureau (CORIA) for providing us with YALES2. This work has been carried out by the Li-PaRAD laboratory, PeXL and the Exascale Computing Research laboratory, with the support of CEA, Intel, UVSQ. Intel granted us dedicted access to a Skylake SP machine on which the experiments were run. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the CEA, Intel, or UVSQ.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lebras, Y., Charif Rubial, A.S., Dolbeau, R., Jalby, W. (2019). ASSIST: An FDO Source-to-Source Transformation Tool for HPC Applications. In: Niethammer, C., Resch, M., Nagel, W., Brunst, H., Mix, H. (eds) Tools for High Performance Computing 2017. PTHPC 2017. Springer, Cham. https://doi.org/10.1007/978-3-030-11987-4_3
Download citation
DOI: https://doi.org/10.1007/978-3-030-11987-4_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11986-7
Online ISBN: 978-3-030-11987-4
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)