ABSTRACT
Rendering and inverse rendering techniques have recently gained powerful new capabilities and building blocks in the form of neural representations (NRs). Rendering techniques derived from NRs are quickly becoming indispensable tools alongside classic computer graphics algorithms, covering a wide range of functions throughout the full pipeline from sensing to pixels. NRs have recently been used to directly learn the geometric and appearance properties of scenes that were previously hard to capture, and to re-synthesize photorealistic imagery from this information, thereby promising simplifications and replacements for several complex traditional computer graphics problems and algorithms, with scalable quality and predictable performance. In this work we ask: does neural graphics (graphics based on NRs) need hardware support? We study four representative neural graphics applications (NeRF, NSDF, NVR, and GIA) and show that, to render 4k resolution frames at 60 frames per second (FPS), there is a gap of ~1.51× to 55.50× in the desired performance on current GPUs. For AR and VR applications, there is an even larger gap of ~2–4 orders of magnitude (OOM) between the desired performance and the required system power. We identify the input encoding and multi-layer perceptron kernels as the performance bottlenecks, consuming 72.37%, 60.0%, and 59.96% of application time for multi-resolution hashgrid encoding, multi-resolution densegrid encoding, and low-resolution densegrid encoding, respectively. We propose a neural graphics processing cluster (NGPC), a scalable and flexible hardware architecture that directly accelerates the input encoding and multi-layer perceptron kernels through dedicated engines and supports a wide range of neural graphics applications.
To achieve good overall application-level performance improvements, we also accelerate the rest of the kernels by fusing them into a single kernel, leading to a ~9.94× speedup over previous optimized implementations [17], which is sufficient to remove this performance bottleneck. Our results show that NGPC gives up to a 58.36× end-to-end application-level performance improvement. For multi-resolution hashgrid encoding, averaged across the four neural graphics applications, the performance benefits are 12.94×, 20.85×, 33.73×, and 39.04× for hardware scaling factors of 8, 16, 32, and 64, respectively. With multi-resolution hashgrid encoding, NGPC enables the rendering of 4k Ultra HD resolution frames at 30 FPS for NeRF and 8k Ultra HD resolution frames at 120 FPS for all our other neural graphics applications.
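To make the 4k at 60 FPS target concrete, the arithmetic behind the reported gap can be sketched as follows. This is an illustrative calculation, not the authors' methodology: it assumes that "4k" means 3840×2160 pixels and that the gap is the ratio between the required and the achieved frame rate. The function names are hypothetical.

```python
# Illustrative frame-budget arithmetic for the 4k / 60 FPS target.
# Assumptions (not stated in the abstract): "4k" is 3840x2160 pixels, and the
# reported gap is the ratio of required frame rate to achieved frame rate.

WIDTH, HEIGHT = 3840, 2160  # assumed 4k Ultra HD resolution
TARGET_FPS = 60


def frame_budget_ms(fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels that must be produced per second at the target frame rate."""
    return width * height * fps


def implied_frame_time_ms(gap: float, fps: float) -> float:
    """Frame time implied by a performance gap relative to the target."""
    return gap * frame_budget_ms(fps)


def implied_fps(gap: float, fps: float) -> float:
    """Achieved frame rate implied by a performance gap relative to the target."""
    return fps / gap


if __name__ == "__main__":
    print(f"budget per frame: {frame_budget_ms(TARGET_FPS):.2f} ms")
    print(f"pixels per second: {pixel_rate(WIDTH, HEIGHT, TARGET_FPS):,}")
    for gap in (1.51, 55.50):  # gap range reported in the abstract
        print(
            f"gap {gap:5.2f}x -> {implied_frame_time_ms(gap, TARGET_FPS):7.2f} ms/frame, "
            f"{implied_fps(gap, TARGET_FPS):5.2f} FPS"
        )
```

Under these assumptions, the budget is about 16.67 ms per frame, so a gap of 1.51× corresponds to roughly 25 ms per frame (about 40 FPS), and a gap of 55.50× to roughly 925 ms per frame (about 1 FPS).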
References
- 2022. NVIDIA GeForce RTX 3090. https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622
- 2022. Vulkan Tutorial - Graphics Pipeline Basics. https://vulkan-tutorial.com/Drawing_a_triangle/Graphics_pipeline_basics/Introduction
- Md Zahangir Alom, Tarek M Taha, Christopher Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Brian C Van Esesn, Abdul A S Awwal, and Vijayan K Asari. 2018. The history began from alexnet: A comprehensive survey on deep learning approaches. arXiv preprint arXiv:1803.01164 (2018).
- Shai Avidan and Amnon Shashua. 1997. Novel view synthesis in tensor space. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1034–1040.
- Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. 2021. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5855–5864.
- Rohan Chabra, Jan E Lenssen, Eddy Ilg, Tanner Schmidt, Julian Straub, Steven Lovegrove, and Richard Newcombe. 2020. Deep local shapes: Learning local sdf priors for detailed 3d reconstruction. In European Conference on Computer Vision. Springer, 608–625.
- Robert A Drebin, Loren Carpenter, and Pat Hanrahan. 1988. Volume rendering. ACM Siggraph Computer Graphics 22, 4 (1988), 65–74.
- Fabian Giesen. 2011. A trip through the Graphics Pipeline 2011. https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/
- Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, and Mohammed Bennamoun. 2020. Deep learning for 3d point clouds: A survey. IEEE transactions on pattern analysis and machine intelligence 43, 12 (2020), 4338–4364.
- Chiyu Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, Thomas Funkhouser, et al. 2020. Local implicit grid representations for 3d scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6001–6010.
- Barthold Lichtenbelt, Randy Crane, and Shaz Naqvi. 1998. Introduction to volume rendering. Prentice-Hall, Inc.
- Stephen Robert Marschner. 1998. Inverse rendering for computer graphics. Cornell University.
- Ricardo Martin-Brualla, Noha Radwan, Mehdi SM Sajjadi, Jonathan T Barron, Alexey Dosovitskiy, and Daniel Duckworth. 2021. Nerf in the wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7210–7219.
- Michael Meißner, Jian Huang, Dirk Bartz, Klaus Mueller, and Roger Crawfis. 2000. A practical evaluation of popular volume rendering algorithms. In Proceedings of the 2000 IEEE symposium on Volume visualization. 81–90.
- Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. 2021. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 65, 1 (2021), 99–106.
- Bert Moons, Bert De Brabandere, Luc Van Gool, and Marian Verhelst. 2016. Energy-efficient convnets through approximate computing. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 1–8.
- Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG) 41, 4 (2022), 1–15.
- Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. NVlabs - instant-ngp. https://github.com/NVlabs/instant-ngp.
- Thomas Müller, Fabrice Rousselle, Jan Novák, and Alexander Keller. 2021. Real-time neural radiance caching for path tracing. ACM Transactions on Graphics (TOG) 40, 4 (2021), 1–16.
- Thomas Neff, Pascal Stadlbauer, Mathias Parger, Andreas Kurz, Joerg H Mueller, Chakravarty R Alla Chaitanya, Anton Kaplanyan, and Markus Steinberger. 2021. DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks. In Computer Graphics Forum, Vol. 40. Wiley Online Library, 45–59.
- Kumud Nepal, Yueting Li, R Iris Bahar, and Sherief Reda. 2014. ABACUS: A technique for automated behavioral synthesis of approximate computing circuits. In 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1–6.
- Baptiste Nicolet, Alec Jacobson, and Wenzel Jakob. 2021. Large steps in inverse rendering of geometry. ACM Transactions on Graphics (TOG) 40, 6 (2021), 1–13.
- Angshuman Parashar, Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A Ying, Anurag Mukkara, Rangharajan Venkatesan, Brucek Khailany, Stephen W Keckler, and Joel Emer. 2019. Timeloop: A systematic approach to dnn accelerator evaluation. In 2019 IEEE international symposium on performance analysis of systems and software (ISPASS). IEEE, 304–315.
- Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 165–174.
- Gustavo Patow and Xavier Pueyo. 2003. A survey of inverse rendering problems. In Computer graphics forum, Vol. 22. Wiley Online Library, 663–687.
- Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. 2021. D-nerf: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10318–10327.
- Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred Hamprecht, Yoshua Bengio, and Aaron Courville. 2019. On the spectral bias of neural networks. In International Conference on Machine Learning. PMLR, 5301–5310.
- Purushothaman Raja and Sivagurunathan Pugazhenthi. 2012. Optimal path planning of mobile robots: A review. International journal of physical sciences 7, 9 (2012), 1314–1320.
- William T Reeves, David H Salesin, and Robert L Cook. 1987. Rendering antialiased shadows with depth maps. In Proceedings of the 14th annual conference on Computer graphics and interactive techniques. 283–291.
- Hans Fuhan Shi and Shahram Payandeh. 2008. GPU in haptic rendering of deformable objects. In International Conference on Human Haptic Sensing and Touch Enabled Computer Applications. Springer, 163–168.
- Aaron Stillmaker and Bevan Baas. 2017. Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm. Integration 58 (2017), 74–81.
- Rohan Tahir, Allah Bux Sargano, and Zulfiqar Habib. 2021. Voxel-based 3D object reconstruction from single 2D image using variational autoencoders. Mathematics 9, 18 (2021), 2288.
- Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. 2020. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems 33 (2020), 7537–7547.
- Yuk Ming Tang and Ho Lun Ho. 2020. 3D modeling and computer graphics in virtual reality. In Mixed Reality and Three-Dimensional Computer Graphics. IntechOpen.
- Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, et al. 2020. State of the art on neural rendering. In Computer Graphics Forum, Vol. 39. Wiley Online Library, 701–727.
- Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul Srinivasan, Edgar Tretschk, W Yifan, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, et al. 2022. Advances in neural rendering. In Computer Graphics Forum, Vol. 41. Wiley Online Library, 703–735.
- Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. 2018. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European conference on computer vision (ECCV). 52–67.
- Kelly Ward, Florence Bertails, Tae-Yong Kim, Stephen R Marschner, Marie-Paule Cani, and Ming C Lin. 2007. A survey on hair modeling: Styling, simulation, and rendering. IEEE transactions on visualization and computer graphics 13, 2 (2007), 213–234.
- Lee Westover. 1989. Interactive volume rendering. In Proceedings of the 1989 Chapel Hill workshop on Volume visualization. 9–16.
- Craig M Wittenbrink. 1998. Survey of parallel volume rendering algorithms. Hewlett Packard Laboratories.
- Yannan Nellie Wu, Joel S Emer, and Vivienne Sze. 2019. Accelergy: An architecture-level energy estimation methodology for accelerator designs. In 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 1–8.
Index Terms
- Hardware Acceleration of Neural Graphics