Abstract
The polyhedron model is a powerful framework for systematically identifying and applying loop transformations that improve data locality (e.g., via tiling) and enable parallelization. In the polyhedron model, a loop transformation is essentially represented as an affine function. Well-established algorithms for discovering promising transformations are based on performance models, which has the drawback that they are not easily adapted to the characteristics of a specific program or target hardware. An iterative search for promising loop transformations is more easily adaptable and can help to learn better models. We present an iterative optimization method in the polyhedron model that targets tiling and parallelization. The method supports either random sampling of the search space of legal loop transformations or a more directed search via a genetic algorithm, for which we propose a set of novel, tailored reproduction operators. We evaluate our approach against existing iterative and model-driven optimization strategies and compare the convergence rate of our genetic algorithm to that of random exploration. Our iterative approach outperforms existing optimization techniques in that it finds loop transformations that yield significantly higher performance. If well configured, random exploration turns out to be very effective and reduces the need for a genetic algorithm.
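To make the idea of a loop transformation as an affine function and of random sampling among legal transformations concrete, here is a minimal sketch. It is not the method of the paper. It assumes a 2-D loop nest with uniform dependences given as distance vectors, uses only the linear part of the affine function (no constant offset), restricts candidates to unimodular matrices with small coefficients, and finds legal ones by simple rejection sampling. All function names and the example dependences are hypothetical.

```python
import random

def apply_schedule(T, point):
    """Map an iteration point (or distance vector) through the linear schedule T."""
    return tuple(sum(c * x for c, x in zip(row, point)) for row in T)

def lex_positive(v):
    """True if the first nonzero entry of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False

def is_legal(T, distances):
    """Assumed legality test: every transformed dependence distance must stay
    lexicographically positive, so each source still runs before its target."""
    return all(lex_positive(apply_schedule(T, d)) for d in distances)

def det2(T):
    """Determinant of a 2x2 matrix given as a tuple of rows."""
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def sample_legal(distances, rng, bound=1, tries=1000):
    """Rejection sampling: draw small integer matrices until one is
    unimodular (a one-to-one reordering) and legal; None if none is found."""
    for _ in range(tries):
        T = tuple(tuple(rng.randint(-bound, bound) for _ in range(2))
                  for _ in range(2))
        if abs(det2(T)) == 1 and is_legal(T, distances):
            return T
    return None

# Hypothetical dependences of a loop body such as
#   A[i][j] = A[i-1][j+1] + A[i-1][j]
DISTANCES = [(1, -1), (1, 0)]

IDENTITY = ((1, 0), (0, 1))      # original loop order
INTERCHANGE = ((0, 1), (1, 0))   # swap the i and j loops
SKEW = ((1, 0), (1, 1))          # skew the inner loop by the outer one

if __name__ == "__main__":
    for name, T in [("identity", IDENTITY), ("interchange", INTERCHANGE), ("skew", SKEW)]:
        print(name, is_legal(T, DISTANCES))
    print("sampled:", sample_legal(DISTANCES, random.Random(0)))
```

In this toy setting, interchange is rejected because it would reverse the dependence with distance (1, -1), whereas skewing keeps both dependences in order. An iterative optimizer would additionally generate code for each sampled transformation and measure its run time, which the sketch leaves out.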
Index Terms
- Iterative Schedule Optimization for Parallelization in the Polyhedron Model