Maintaining Trust in Reduction: Preserving the Accuracy of Quantities of Interest for Lossy Compression

  • Conference paper
Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation (SMC 2021)

Abstract

As the growth of data sizes continues to outpace computational resources, there is a pressing need for data reduction techniques that can significantly reduce the amount of data and quantify the error incurred in compression. Compressing scientific data presents many challenges for reduction techniques, since the data are often defined on non-uniform or unstructured meshes, come from a high-dimensional space, and have many Quantities of Interest (QoIs) that need to be preserved. To illustrate these challenges, we focus on data from a large-scale fusion code, XGC. XGC uses a Particle-In-Cell (PIC) technique that generates hundreds of petabytes (PB) of data a day from thousands of timesteps. XGC uses an unstructured mesh and needs to compute many QoIs from the raw data, f.

One critical aspect of the reduction is that we need to ensure that QoIs derived from the data (density, temperature, flux-surface-averaged momenta, etc.) maintain relatively high accuracy. We show that by compressing XGC data on the high-dimensional, non-uniform grid on which the data are defined, and by adaptively quantizing the decomposed coefficients based on the characteristics of the QoIs, the compression ratios obtained at various error tolerances using a multilevel compressor (MGARD) increase by more than a factor of ten. We then present how to mathematically guarantee that the accuracy of the QoIs computed from the reduced f is preserved during compression. We show that the error in the XGC density can be kept under a user-specified tolerance over 1000 timesteps of simulation using the mathematical QoI error control theory of MGARD, whereas traditional error control on the data to be reduced does not guarantee the accuracy of the QoIs.
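The difference between bounding the error in the data and bounding the error in a QoI can be illustrated with a small numerical sketch. The example below is not MGARD and does not use XGC data: it assumes a synthetic Maxwellian-like array `f`, a hypothetical density QoI defined as a weighted sum over a one-dimensional velocity grid, and a plain uniform quantizer standing in for a lossy compressor. For a linear QoI with weights w, a pointwise bound eps on f only implies a QoI error of at most eps times the sum of |w|, so meeting a QoI tolerance `tau` requires choosing the data tolerance from the QoI rather than the other way round.

```python
import numpy as np

def quantize(f, eps):
    """Uniform scalar quantization; pointwise error is at most eps."""
    return 2.0 * eps * np.round(f / (2.0 * eps))

def density(f, weights):
    """Hypothetical linear QoI: quadrature over the velocity axis."""
    return f @ weights

# Synthetic stand-in for f: one Maxwellian per mesh node (assumption).
rng = np.random.default_rng(0)
n_nodes, n_vel = 200, 64
v = np.linspace(-4.0, 4.0, n_vel)
weights = np.full(n_vel, v[1] - v[0])          # simple quadrature weights
temps = rng.uniform(0.5, 2.0, size=(n_nodes, 1))
f = np.exp(-v**2 / (2.0 * temps)) / np.sqrt(2.0 * np.pi * temps)

# Error control on the data only: the QoI error is bounded, but by
# eps * sum|w|, which can be larger than eps itself.
eps = 1e-3
f_hat = quantize(f, eps)
data_err = np.max(np.abs(f - f_hat))
qoi_err = np.max(np.abs(density(f, weights) - density(f_hat, weights)))
qoi_bound = eps * np.sum(np.abs(weights))

# Error control driven by the QoI: pick eps so the bound equals tau.
tau = 1e-3
eps_for_qoi = tau / np.sum(np.abs(weights))
f_hat_qoi = quantize(f, eps_for_qoi)
qoi_err_controlled = np.max(
    np.abs(density(f, weights) - density(f_hat_qoi, weights))
)

print(f"data error {data_err:.3e}, QoI error {qoi_err:.3e}, bound {qoi_bound:.3e}")
print(f"QoI error with QoI-driven tolerance: {qoi_err_controlled:.3e} (tau = {tau:.1e})")
```

The worst-case bound used here is generic and usually pessimistic; the QoI error control theory referred to in the abstract works on the multilevel coefficients rather than on pointwise values, which this sketch does not attempt to reproduce.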

Acknowledgement

This research was supported by the ECP CODAR, Sirius-2, and RAPIDS-2 projects through the Advanced Scientific Computing Research (ASCR) program of the Department of Energy, and the LDRD project through the DRD program of Oak Ridge National Laboratory.

Corresponding author

Correspondence to Qian Gong.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Gong, Q. et al. (2022). Maintaining Trust in Reduction: Preserving the Accuracy of Quantities of Interest for Lossy Compression. In: Nichols, J., et al. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation. SMC 2021. Communications in Computer and Information Science, vol 1512. Springer, Cham. https://doi.org/10.1007/978-3-030-96498-6_2

  • DOI: https://doi.org/10.1007/978-3-030-96498-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96497-9

  • Online ISBN: 978-3-030-96498-6

  • eBook Packages: Computer Science (R0)
