ABSTRACT
Numerical simulations often generate large amounts of data that must be stored or sent to other compute nodes. This paper investigates whether GPUs are powerful enough to make real-time data compression and decompression possible in such environments, that is, whether they can operate at the 32- or 40-Gb/s throughput of emerging network cards. The fastest parallel CPU-based floating-point compression algorithm operates below 20 Gb/s on eight Xeon cores, which is significantly slower than the network speed and thus insufficient for compression to be practical in high-end networks. As a remedy, we have created GFC, a highly parallel compression algorithm for double-precision floating-point data that is specifically designed for GPUs. It compresses at a minimum of 75 Gb/s, decompresses at 90 Gb/s and above, and can therefore improve internode communication throughput on current and upcoming networks by fully saturating the interconnection links with compressed data.
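The abstract does not show the GFC kernel itself. As a rough illustration of the value-prediction plus leading-zero-byte-suppression idea that this family of floating-point compressors builds on, the following is a minimal sequential C sketch: each double is predicted by its predecessor, the XOR of the two bit patterns is computed, and only the nonzero low-order bytes of that residual are emitted behind a small header. The XOR predictor, the one-byte headers, and the function names are simplifications for illustration, not the paper's actual algorithm.

```c
#include <stdint.h>
#include <string.h>

/* Count how many of the most-significant bytes of v are zero (0..8). */
static int leading_zero_bytes(uint64_t v) {
    int n = 0;
    for (int i = 7; i >= 0; i--) {
        if ((v >> (8 * i)) & 0xFF) break;
        n++;
    }
    return n;
}

/* Compress n doubles into out; returns the number of bytes written.
   Residual = bits XOR previous bits; smooth data yields residuals with
   many leading zero bytes, which are suppressed. */
size_t compress_sketch(const double *data, size_t n, uint8_t *out) {
    uint64_t prev = 0;
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t bits;
        memcpy(&bits, &data[i], 8);   /* reinterpret double as raw bits */
        uint64_t delta = bits ^ prev; /* residual vs. predecessor */
        prev = bits;
        int z = leading_zero_bytes(delta);
        out[pos++] = (uint8_t)z;      /* header: zero-byte count */
        for (int b = 7 - z; b >= 0; b--)
            out[pos++] = (uint8_t)(delta >> (8 * b)); /* payload bytes */
    }
    return pos;
}

/* Inverse transform: rebuild each double from header + payload bytes. */
size_t decompress_sketch(const uint8_t *in, size_t n, double *out) {
    uint64_t prev = 0;
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        int z = in[pos++];
        uint64_t delta = 0;
        for (int b = 0; b < 8 - z; b++)
            delta = (delta << 8) | in[pos++];
        uint64_t bits = delta ^ prev;
        prev = bits;
        memcpy(&out[i], &bits, 8);
    }
    return pos;
}
```

In the GPU setting described by the paper, the input is instead split into many independent chunks that are processed concurrently (and the per-value headers are packed more tightly), which is what makes the massively parallel throughput figures possible; the sketch above only conveys the per-value transform.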