research-article

FreePipe: a programmable parallel rendering architecture for efficient multi-fragment effects

Authors:
Fang Liu, Chinese Academy of Sciences
Meng-Cheng Huang, Chinese Academy of Sciences
Xue-Hui Liu, Chinese Academy of Sciences
En-Hua Wu, Chinese Academy of Sciences and University of Macau

I3D '10: Proceedings of the 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games, February 2010, Pages 75–82
https://doi.org/10.1145/1730804.1730817

Published: 19 February 2010

ABSTRACT

Over the past decade, GPUs have become increasingly programmable through vertex, geometry and fragment shaders. However, many classical problems still cannot be solved efficiently with the current graphics pipeline, in which some stages remain fixed-function on chip. In particular, multi-fragment effects, especially order-independent transparency, require a programmable blending stage, which makes them difficult to achieve in a single geometry pass. In this paper we present FreePipe, a system for programmable parallel rendering that runs entirely on current graphics hardware and delivers performance comparable to the traditional graphics pipeline. Within this framework, we develop two schemes that use CUDA atomic operations to render multi-fragment effects efficiently in a single geometry pass. Both schemes achieve significant speedups over state-of-the-art methods based on the traditional graphics pipeline.
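The abstract does not describe the two schemes in detail, so the following is only a rough sketch of the general idea of capturing every fragment of a pixel in one pass with an atomic counter and blending afterwards. It is not the authors' implementation: it runs on the CPU, uses `std::atomic::fetch_add` as a stand-in for a CUDA atomic increment, and the names `Fragment`, `Color`, `FragmentBuffer`, `insert`, `count` and `resolve` are all hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical fragment record: depth plus a non-premultiplied RGBA colour.
struct Fragment {
    float depth;
    float r, g, b, a;
};

struct Color {
    float r, g, b, a;
};

// Hypothetical per-pixel fragment store (an assumption for illustration,
// not FreePipe's actual data layout): a fixed number of slots per pixel
// and one atomic counter per pixel.
class FragmentBuffer {
public:
    FragmentBuffer(std::size_t pixels, std::size_t slotsPerPixel)
        : slots_(slotsPerPixel),
          counts_(pixels),
          data_(pixels * slotsPerPixel) {
        for (auto& c : counts_) c.store(0u);
    }

    // Reserve a unique slot with an atomic increment, then write the fragment.
    // Concurrent callers never receive the same slot index for a pixel.
    bool insert(std::size_t pixel, const Fragment& f) {
        const unsigned slot = counts_[pixel].fetch_add(1u, std::memory_order_relaxed);
        if (slot >= slots_) return false;  // pixel is full; the fragment is dropped
        data_[pixel * slots_ + slot] = f;
        return true;
    }

    // Number of fragments actually stored for a pixel.
    std::size_t count(std::size_t pixel) const {
        return std::min<std::size_t>(counts_[pixel].load(), slots_);
    }

    // Sort the captured fragments front to back and composite them.
    // Must only be called after all insert() calls have completed.
    Color resolve(std::size_t pixel) const {
        const auto first = data_.begin() + static_cast<std::ptrdiff_t>(pixel * slots_);
        std::vector<Fragment> frags(first, first + static_cast<std::ptrdiff_t>(count(pixel)));
        std::sort(frags.begin(), frags.end(),
                  [](const Fragment& x, const Fragment& y) { return x.depth < y.depth; });

        Color out{0.0f, 0.0f, 0.0f, 0.0f};
        float transmittance = 1.0f;  // fraction of light still passing through
        for (const Fragment& f : frags) {
            const float w = transmittance * f.a;
            out.r += w * f.r;
            out.g += w * f.g;
            out.b += w * f.b;
            out.a += w;
            transmittance *= 1.0f - f.a;
        }
        return out;
    }

private:
    std::size_t slots_;
    std::vector<std::atomic<unsigned>> counts_;
    std::vector<Fragment> data_;
};
```

In this sketch, fragments can be passed to `insert` in any order and `resolve` produces the same blended colour, because ordering is recovered by the per-pixel depth sort rather than by submission order. A GPU version would presumably replace the counter with a device-side atomic and run `resolve` as a separate per-pixel pass.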

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Image and video acquisition
        3D imaging
  Computer graphics
    Animation

Published in
I3D '10: Proceedings of the 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games
February 2010
201 pages
ISBN:9781605589398
DOI:10.1145/1730804
Conference Chairs:
Daniel Aliaga, Purdue University
Manuel M. Oliveira, Universidade Federal do Rio Grande do Sul - UFRGS
General Chairs:
Amitabh Varshney, University of Maryland
Chris Wyman, University of Iowa
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
atomic operation
compute unified device architecture (CUDA)
depth peeling
graphics hardware
multi-fragment effects
order-independent transparency
programmable graphics pipeline
rasterizer
Acceptance Rates
Overall Acceptance Rate: 148 of 485 submissions, 31%