Case study
Acceleration of the Geostatistical Software Library (GSLIB) by code optimization and hybrid parallel programming
Introduction
The Geostatistical Software Library (GSLIB), originally presented by Deutsch and Journel (1998), has been used in the geostatistical community for more than thirty years. It contains plotting utilities (histograms, probability plots, Q–Q/P–P plots, scatter plots, location maps), data transformation utilities, measures of spatial continuity (variograms), kriging estimation and stochastic simulation applications. Of these components, estimation and simulation are among the most used, and they can be executed with large data sets and estimation/simulation grids. Large scenarios require several minutes or hours of elapsed time to finish, due to the heavy computations involved and their sequential implementation. Since their original development, these routines have helped many researchers and practitioners in their studies, mainly due to the accuracy and performance delivered by this package. Several efforts have been made to accelerate or extend the scope of the original package, with WinGslib (Statios LLC, 2001), SGeMS (Remy et al., 2009) and HPGL (High performance geostatistics library, 2010) being the most relevant. SGeMS and HPGL move away from Fortran and implement conventional and new algorithms in Python and C/C++. Although there is a significant gain with this change, for many practitioners and researchers the simplicity of Fortran code and the availability of an extensive pool of modified GSLIB-based programs make it hard to abandon this package.
To the authors' knowledge, few efforts have been reported that accelerate the GSLIB package itself by analyzing, optimizing and accelerating the original Fortran routines. In this work we present case studies of accelerations performed on original GSLIB routines (in their Fortran 90 versions), using code optimization and multi-core programming techniques. We explain our methodology, in which a performance profile is obtained from the original routine, with the aim of identifying overhead sources in the code. After that, incremental modifications are applied to the code in order to accelerate the execution. OpenMP (Chandra et al., 2001) directives and MPI (Snir et al., 1998) instructions are added to the most time-consuming parts of the routines. Similar experiences in other geostatistical codes have been reported in Straubhaar et al. (2013) and Peredo et al. (2014).
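As a rough illustration of how independent work items might be spread over MPI processes, the sketch below computes a static block partition. It is written in Python rather than Fortran with MPI, and the function name `block_range`, its parameters and the choice of a block partition are assumptions made for illustration; the source does not state which distribution scheme the modified routines use.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open range [start, stop) of items owned by `rank`.

    The first `n_items % n_ranks` ranks receive one extra item, so the
    load differs by at most one item between any two ranks.
    """
    if n_ranks <= 0 or not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Hypothetical case: 10 work items spread over 4 ranks.
    for r in range(4):
        print(r, block_range(10, 4, r))
```

In a hybrid setting, each process would take its own range and could then split the items inside that range among its threads.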
Section snippets
GSLIB structure
According to GSLIB documentation (Deutsch and Journel, 1998), the software package is composed of a set of utility routines, compiled and wrapped as a static library named gslib.a, and a set of applications that call some of the wrapped routines. We will refer to these two sets as utilities and applications. Typically, a main program and two subroutines compose an application (Fig. 1). The first subroutine is in charge of reading the parameters from the input files, and the second subroutine
Methodology
Re-design: First we have to re-design the application/utility code to identify the state of each variable, array or common block during the execution. This step is necessary to enable the user/programmer to identify the scope of each variable (data-flow analysis), in order to insert OpenMP directives into the code in a smooth and easy way.
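The variable classification described in the re-design step can be illustrated with a small sketch. It is written in Python rather than Fortran with OpenMP, and the function names (`lag_partial_sum`, `semivariogram`) and the 1-D regular-grid setting are assumptions for illustration, not GSLIB code. The point is only the classification: `values` is read-only and can be shared by all tasks, while the accumulators are private to each task, which mirrors the decision between OpenMP `shared` and `private` clauses.

```python
from concurrent.futures import ThreadPoolExecutor


def lag_partial_sum(values, lag):
    """Return (sum of squared differences, pair count) for one lag.

    `values` is shared and read-only; `total` and `pairs` are private
    to this call, so concurrent calls never write to the same variable.
    """
    total = 0.0
    pairs = 0
    for i in range(len(values) - lag):
        diff = values[i + lag] - values[i]
        total += diff * diff
        pairs += 1
    return total, pairs


def semivariogram(values, lags, workers=2):
    """Experimental semivariogram: sum((z[i+h] - z[i])**2) / (2 * N(h))."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per lag; results come back in the order of `lags`.
        partials = list(pool.map(lambda h: lag_partial_sum(values, h), lags))
    # Combine step: each lag's private sums become one output value.
    return [t / (2.0 * n) if n > 0 else float("nan") for t, n in partials]


if __name__ == "__main__":
    z = [1.0, 2.0, 4.0, 7.0]
    print(semivariogram(z, [1, 2]))
```

Python threads are used here only to make the sharing pattern explicit; the sketch is not meant to demonstrate a speedup.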
Profiling and code optimization: After re-design, we have to study the run-time behavior of the application using a profiler tool. In our case we choose the
Case study
The proposed methodology was applied to accelerate four GSLIB applications: gamv, kt3d, sgsim and sisim. We tested the final versions of the applications on two Linux-based systems: the Server, running the SUSE operating system with multiple nodes of 2×8-core Intel Xeon CPU E5-2670 at 2.60 GHz interconnected through a fast Infiniband FDR10 network, and the Desktop, running the openSUSE operating system with a single node of 1×4-core Intel Xeon CPU E3-1225 at 3.10 GHz. All programs were compiled using GCC gfortran
Conclusions and future work
We have presented a methodology to accelerate GSLIB applications and utilities based on code optimizations and hybrid parallel programming, using multi-core and multi-node execution with OpenMP directives and MPI task distribution. The methodology was tested on four well-known GSLIB applications: gamv, kt3d, sgsim and sisim. All tests were performed on Linux-based systems. However, no additional external libraries or intrinsic operating system routines were used, so the code could be compiled and
Source code
The current version of the modified codes can be downloaded from https://github.com/operedo/gslib-alges.
Acknowledgements
The authors gratefully acknowledge the computer resources, technical expertise and assistance provided by the Barcelona Supercomputing Center – Centro Nacional de Supercomputación (Spain), which supports the Marenostrum supercomputer, and the National Laboratory for High Performance Computing (Chile), which supports the Leftraru supercomputer. Additional thanks are owed to industrial supporters of ALGES laboratory, in particular Yamana Gold, as well as the Advanced Mining Technology Center
References (25)
- A general parallelization strategy for random path based geostatistical simulation methods. Comput. Geosci. (2010)
- et al. Tuning and hybrid parallelization of a genetic-based multi-point statistics simulation code. Parallel Comput. (2014)
- et al. A conflict-free, path-level parallelization approach for sequential simulation algorithms. Comput. Geosci. (2015)
- ACORN: a new method for generating sequences of uniformly distributed pseudo-random numbers. J. Comput. Phys. (1989)
- et al. Compilers: Principles, Techniques, and Tools (2006)
- Amdahl, G.M., 1967. Validity of the single processor approach to achieving large scale computing capabilities. In:...
- et al. The Design of OpenMP Tasks. IEEE Trans. Parallel Distrib. Syst. (2009)
- Barcelona Supercomputing Center, 2015. Paraver/extrae performance analysis tools. Computer Science Department. Website...
- et al. Parallel Programming in OpenMP (2001)
- Culler, D., Singh, J., Gupta, A., 1998. Parallel computer architecture: a hardware/software approach. In: The Morgan...
- GSLIB: Geostatistical Software Library and User's Guide
- gprof: A Call Graph Execution Profiler. SIGPLAN Not