

Towards real-time photorealistic 3D holography with deep neural networks

An Author Correction to this article was published on 26 April 2021


Abstract

The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality, human–computer interaction, education and training. Computer-generated holography (CGH) enables high-spatio-angular-resolution 3D projection via numerical simulation of diffraction and interference1. Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion2,3. The computationally taxing Fresnel diffraction simulation further places an explicit trade-off between image quality and runtime, making dynamic holography impractical4. Here we demonstrate a deep-learning-based CGH pipeline capable of synthesizing a photorealistic colour 3D hologram from a single RGB-depth image in real time. Our convolutional neural network (CNN) is extremely memory efficient (below 620 kilobytes) and runs at 60 hertz for a resolution of 1,920 × 1,080 pixels on a single consumer-grade graphics processing unit. Leveraging low-power on-device artificial intelligence acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 hertz) and edge (Google Edge TPU at 2.0 hertz) devices, promising real-time performance in future-generation virtual and augmented-reality mobile headsets. We enable this pipeline by introducing a large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-depth images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions5 and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking, high-resolution 3D holograms. Our learning-based approach and the Fresnel hologram dataset will help to unlock the full potential of holography and enable applications in metasurface design6,7, optical and acoustic tweezer-based microscopic manipulation8,9,10, holographic microscopy11 and single-exposure volumetric 3D printing12,13.
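As context for the physical simulation the CNN learns to approximate, the sketch below shows free-space propagation with the angular spectrum method, the standard numerical route to Fresnel diffraction (the paper uses the band-limited variant of ref. 47). The function name, parameter values and the omission of band-limiting are illustrative choices for this sketch, not the released implementation.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pitch, distance):
    """Propagate a complex wave field over `distance` in free space.

    field      : 2D complex array sampled on the hologram plane
    wavelength : wavelength of the light (m)
    pitch      : pixel pitch of the sampling grid (m)
    distance   : signed propagation distance (m)
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)        # spatial frequencies (1/m)
    fy = np.fft.fftfreq(ny, d=pitch)
    fxx, fyy = np.meshgrid(fx, fy)

    # Free-space transfer function; evanescent components are dropped.
    # (The band-limiting of ref. 47 is omitted here for brevity.)
    arg = 1.0 - (wavelength * fxx) ** 2 - (wavelength * fyy) ** 2
    kz = 2.0 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.exp(1j * kz * distance) * (arg > 0)

    return np.fft.ifft2(np.fft.fft2(field) * transfer)
```

Given a predicted hologram, propagating it by a chosen focal distance and taking the squared magnitude of the result yields focal-stack reconstructions of the kind shown in the figures.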


Fig. 1: Tensor holography workflow for learning Fresnel holograms from RGB-D images.
Fig. 2: Performance evaluation of the OA-PBM and tensor holography CNN.
Fig. 3: Experimental demonstration of 2D and 3D holographic projection.


Data availability

Our hologram dataset (MIT-CGH-4K) and the trained CNN model will be made publicly available (on GitHub) along with the paper.

Code availability

The code to evaluate the trained CNN model will be made publicly available (on GitHub) along with the paper. Additional code is available from the corresponding authors upon reasonable request.


References

  1. Benton, S. A. & Bove, V. M. Jr Holographic Imaging (John Wiley & Sons, 2008).

  2. Maimone, A., Georgiou, A. & Kollin, J. S. Holographic near-eye displays for virtual and augmented reality. ACM Trans. Graph. 36, 85:1–85:16 (2017).


  3. Shi, L., Huang, F.-C., Lopes, W., Matusik, W. & Luebke, D. Near-eye light field holographic rendering with spherical waves for wide field of view interactive 3D computer graphics. ACM Trans. Graph. 36, 236:1–236:17 (2017).


  4. Tsang, P. W. M., Poon, T.-C. & Wu, Y. M. Review of fast methods for point-based computer-generated holography [Invited]. Photon. Res. 6, 837–846 (2018).


  5. Sitzmann, V. et al. End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging. ACM Trans. Graph. 37, 114:1–114:13 (2018).


  6. Lee, G.-Y. et al. Metasurface eyepiece for augmented reality. Nat. Commun. 9, 4562 (2018).


  7. Hu, Y. et al. 3D-integrated metasurfaces for full-colour holography. Light Sci. Appl. 8, 86 (2019).


  8. Melde, K., Mark, A. G., Qiu, T. & Fischer, P. Holograms for acoustics. Nature 537, 518–522 (2016).


  9. Smalley, D. et al. A photophoretic-trap volumetric display. Nature 553, 486–490 (2018).


  10. Hirayama, R., Plasencia, D. M., Masuda, N. & Subramanian, S. A volumetric display for visual, tactile and audio presentation using acoustic trapping. Nature 575, 320–323 (2019).


  11. Rivenson, Y., Wu, Y. & Ozcan, A. Deep learning in holography and coherent imaging. Light Sci. Appl. 8, 85 (2019).


  12. Shusteff, M. et al. One-step volumetric additive manufacturing of complex polymer structures. Sci. Adv. 3, eaao5496 (2017).


  13. Kelly, B. E. et al. Volumetric additive manufacturing via tomographic reconstruction. Science 363, 1075–1079 (2019).


  14. Levoy, M. & Hanrahan, P. Light field rendering. In Proc. 23rd Annual Conference on Computer Graphics and Interactive Techniques 31–42 (ACM, 1996).

  15. Waters, J. P. Holographic image synthesis utilizing theoretical methods. Appl. Phys. Lett. 9, 405–407 (1966).


  16. Leseberg, D. & Frère, C. Computer-generated holograms of 3-D objects composed of tilted planar segments. Appl. Opt. 27, 3020–3024 (1988).


  17. Tommasi, T. & Bianco, B. Computer-generated holograms of tilted planes by a spatial frequency approach. J. Opt. Soc. Am. A 10, 299–305 (1993).


  18. Matsushima, K. & Nakahara, S. Extremely high-definition full-parallax computer-generated hologram created by the polygon-based method. Appl. Opt. 48, H54–H63 (2009).


  19. Symeonidou, A., Blinder, D., Munteanu, A. & Schelkens, P. Computer-generated holograms by multiple wavefront recording plane method with occlusion culling. Opt. Express 23, 22149–22161 (2015).


  20. Lucente, M. E. Interactive computation of holograms using a look-up table. J. Electron. Imaging 2, 28–35 (1993).


  21. Lucente, M. & Galyean, T. A. Rendering interactive holographic images. In Proc. 22nd Annual Conference on Computer Graphics and Interactive Techniques, 387–394 (ACM, 1995).

  22. Lucente, M. Interactive three-dimensional holographic displays: seeing the future in depth. Comput. Graph. 31, 63–67 (1997).


  23. Chen, J.-S. & Chu, D. P. Improved layer-based method for rapid hologram generation and real-time interactive holographic display applications. Opt. Express 23, 18143–18155 (2015).


  24. Zhao, Y., Cao, L., Zhang, H., Kong, D. & Jin, G. Accurate calculation of computer-generated holograms using angular-spectrum layer-oriented method. Opt. Express 23, 25440–25449 (2015).


  25. Makey, G. et al. Breaking crosstalk limits to dynamic holography using orthogonality of high-dimensional random vectors. Nat. Photon. 13, 251–256 (2019).


  26. Yamaguchi, M., Hoshino, H., Honda, T. & Ohyama, N. in Practical Holography VII: Imaging and Materials Vol. 1914 (ed. Benton, S. A.) 25–31 (SPIE, 1993).

  27. Barabas, J., Jolly, S., Smalley, D. E. & Bove, V. M. Jr in Practical Holography XXV: Materials and Applications Vol. 7957 (ed. Bjelkhagen, H. I.) 13–19 (SPIE, 2011).

  28. Zhang, H., Zhao, Y., Cao, L. & Jin, G. Fully computed holographic stereogram based algorithm for computer-generated holograms with accurate depth cues. Opt. Express 23, 3901–3913 (2015).


  29. Padmanaban, N., Peng, Y. & Wetzstein, G. Holographic near-eye displays based on overlap-add stereograms. ACM Trans. Graph. 38, 214:1–214:13 (2019).


  30. Shimobaba, T., Masuda, N. & Ito, T. Simple and fast calculation algorithm for computer-generated hologram with wavefront recording plane. Opt. Lett. 34, 3133–3135 (2009).


  31. Wakunami, K. & Yamaguchi, M. Calculation for computer generated hologram using ray-sampling plane. Opt. Express 19, 9086–9101 (2011).


  32. Häussler, R. et al. Large real-time holographic 3D displays: enabling components and results. Appl. Opt. 56, F45–F52 (2017).


  33. Hamann, S., Shi, L., Solgaard, O. & Wetzstein, G. Time-multiplexed light field synthesis via factored Wigner distribution function. Opt. Lett. 43, 599–602 (2018).


  34. Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Proc. International Conference on International Conference on Machine Learning (ICML) 807–814 (Omnipress, 2010).

  35. Sinha, A., Lee, J., Li, S. & Barbastathis, G. Lensless computational imaging through deep learning. Optica 4, 1117–1125 (2017).


  36. Metzler, C. et al. prdeep: robust phase retrieval with a flexible deep network. In Proc. International Conference on International Conference on Machine Learning (ICML) 3501–3510 (JMLR, 2018).

  37. Eybposh, M. H., Caira, N. W., Chakravarthula, P., Atisa, M. & Pégard, N. C. in Optics and the Brain BTu2C–2 (Optical Society of America, 2020).

  38. Rivenson, Y., Zhang, Y., Günaydın, H., Teng, D. & Ozcan, A. Phase recovery and holographic image reconstruction using deep learning in neural networks. Light Sci. Appl. 7, 17141 (2018).


  39. Ren, Z., Xu, Z. & Lam, E. Y. Learning-based nonparametric autofocusing for digital holography. Optica 5, 337–344 (2018).


  40. Wu, Y. et al. Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery. Optica 5, 704–710 (2018).


  41. Horisaki, R., Takagi, R. & Tanida, J. Deep-learning-generated holography. Appl. Opt. 57, 3859–3863 (2018).


  42. Peng, Y., Choi, S., Padmanaban, N. & Wetzstein, G. Neural holography with camera-in-the-loop training. ACM Trans. Graph. 39, 185:1–185:14 (2020).


  43. Jiao, S. et al. Compression of phase-only holograms with JPEG standard and deep learning. Appl. Sci. 8, 1258 (2018).


  44. Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S. & Vedaldi, A. Describing textures in the wild. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3606–3613 (IEEE, 2014).

  45. Dai, D., Riemenschneider, H. & Gool, L. V. The synthesizability of texture examples. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3027–3034 (IEEE, 2014).

  46. Kim, C., Zimmer, H., Pritch, Y., Sorkine-Hornung, A. & Gross, M. Scene reconstruction from high spatio-angular resolution light fields. ACM Trans. Graph. 32, 73:1–73:12 (2013).


  47. Matsushima, K. & Shimobaba, T. Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields. Opt. Express 17, 19662–19673 (2009).


  48. Shimobaba, T. & Ito, T. A color holographic reconstruction system by time division multiplexing with reference lights of laser. Opt. Rev. 10, 339–341 (2003).


  49. Hsueh, C. K. & Sawchuk, A. A. Computer-generated double-phase holograms. Appl. Opt. 17, 3874–3883 (1978).


  50. Mendoza-Yero, O., Mínguez-Vega, G. & Lancis, J. Encoding complex fields by using a phase-only optical element. Opt. Lett. 39, 1740–1743 (2014).


  51. Xiao, L., Kaplanyan, A., Fix, A., Chapman, M. & Lanman, D. DeepFocus: learned image synthesis for computational displays. ACM Trans. Graph. 37, 200:1–200:13 (2018).


  52. Wang, Y., Sang, X., Chen, Z., Li, H. & Zhao, L. Real-time photorealistic computer-generated holograms based on backward ray tracing and wavefront recording planes. Opt. Commun. 429, 12–17 (2018).


  53. Hasegawa, N., Shimobaba, T., Kakue, T. & Ito, T. Acceleration of hologram generation by optimizing the arrangement of wavefront recording planes. Appl. Opt. 56, A97–A103 (2017).


  54. Sifatul Islam, M. et al. Max-depth-range technique for faster full-color hologram generation. Appl. Opt. 59, 3156–3164 (2020).


  55. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In International Conference on Learning Representations (ICLR) (2015).

  56. Ronneberger, O., Fischer, P. & Brox, T. U-net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI) 234–241 (Springer, 2015).

  57. Yu, F., Koltun, V. & Funkhouser, T. Dilated residual networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 472–480 (IEEE, 2017).


Acknowledgements

We thank K. Aoyama and S. Wen (from Sony) for discussions; J. Minor, T. Du, M. Foshey, L. Makatura, W. Shou and T. Erps from MIT for improving/editing the manuscript; R. White for the administration of the project; X. Ju for the design of the iPhone demo; and P. Ma for providing an iPhone 11 Pro for the mobile demo. We acknowledge funding from the Sony Research Award Program.

Author information


Contributions

L.S. conceived the idea, implemented the proposed framework, built the display prototype, performed experimental validation, and conducted the iPhone and Edge TPU demos. B.L. performed the pipeline evaluation and made the Supplementary Videos. B.L., C.K. and P.K. were involved in the design of the proposed framework. L.S. and P.K. led the writing and revision of the manuscript. W.M. supervised the work. All authors discussed ideas and results, and contributed to the manuscript.

Corresponding authors

Correspondence to Liang Shi or Wojciech Matusik.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Tomoyoshi Shimobaba and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Visualization of masked Fresnel zone plates computed by OA-PBM and performance comparison of foreground occlusion.

a, A depth image cropped from a frame of Big Buck Bunny. Three regions with different depth landscapes are highlighted in different colours. b, Masked Fresnel zone plates computed for the centre pixel of each highlighted region. Three pixels are propagated for the same distance for ease of comparison. The flat depth landscape around the green pixel results in a non-occluded Fresnel zone plate. The masked Fresnel zone plates of red and blue pixels contain sharp cutoffs at their long-distance separated occlusion boundaries, and freeform shapes at occlusion boundaries with moderate distance separation and varying depth distribution. c, Comparison of foreground reconstruction by the PBM, OA-PBM and Fresnel diffraction. The scene is a cropped modulation transfer function bar target with a step depth profile. The PBM leaks a considerable portion of the background into the foreground due to a lack of occlusion handling. The artefacts are clearly visible in the original unmagnified view. The OA-PBM removes a considerable portion of the artefacts and the remaining artefacts are visually inconsequential in the unmagnified view. d, Comparison of focal stacks reconstructed by the PBM and OA-PBM for the Big Buck Bunny. The orange bounding boxes mark the background leakage in the PBM reconstructions. a, d, Images reproduced from www.bigbuckbunny.org (© 2008, Blender Foundation) under a Creative Commons licence (https://creativecommons.org/licenses/by/3.0/).
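To make the construction in b concrete, the sketch below computes a single point's subhologram (its Fresnel zone plate) and multiplies it by a binary visibility mask; in the OA-PBM that mask is obtained by casting rays from the point to the hologram pixels and testing them against the depth map, a step not reproduced here. The interface and names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def point_subhologram(xc, yc, zc, shape, pitch, wavelength, visibility=None):
    """Fresnel zone plate of one scene point, optionally occlusion-masked.

    (xc, yc) is the point's pixel location, zc its distance to the hologram
    plane (m); `visibility` is the binary mask from the ray-casting test.
    """
    ny, nx = shape
    ys, xs = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    dx = (xs - xc) * pitch
    dy = (ys - yc) * pitch
    r = np.sqrt(dx ** 2 + dy ** 2 + zc ** 2)      # point-to-pixel distance

    wavelet = np.exp(1j * 2.0 * np.pi * r / wavelength) / r   # spherical wavelet
    if visibility is not None:
        wavelet = wavelet * visibility             # sharp occlusion cutoff
    return wavelet
```

Summing the masked wavelets of all scene points, weighted by their amplitudes, gives the kind of occlusion-aware target hologram used for training.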

Extended Data Fig. 2 Samples of the MIT-CGH-4K dataset and comparison with the DeepFocus dataset.

a, The RGB-D image, amplitude and phase of two samples from the MIT-CGH-4K dataset. The RGB image records the amplitude of the scene (directly visualized in sRGB space) and consists of large variations in colour, texture, shading and occlusion. The pixel depth has a statistically uniform distribution throughout the view frustum. The phase presents high-frequency features at both occlusion boundaries and texture edges to accommodate rapid depth and colour changes. b, A sample RGB-D image from the DeepFocus dataset51. c, Histograms of pixel depth distribution computed for the MIT-CGH-4K dataset and the DeepFocus dataset. b, Image reproduced from ‘3D Scans from Louvre Museum’ by Benjamin Bardou under a Creative Commons licence (https://creativecommons.org/licenses/by-nc/4.0/).

Extended Data Fig. 3 Schematic of the midpoint hologram calculation.

a, A holographic display magnified through a diverging point light source. b, A holographic display unmagnified through the thin-lens formula. c, The target hologram in this example is propagated to the centre of the unmagnified view frustum to produce the midpoint hologram. The width of the maximum subhologram is considerably reduced.
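A back-of-the-envelope calculation, with illustrative rather than prototype parameter values, shows why the midpoint hologram helps: the width of a point's subhologram grows linearly with its distance to the hologram plane, and moving the plane to the frustum centre halves the largest such distance.

```python
import numpy as np

# Illustrative display parameters (not the exact prototype values).
wavelength = 532e-9          # green laser (m)
pitch = 8e-6                 # SLM pixel pitch (m)
z_near, z_far = 0.0, 6e-3    # view frustum measured from the target hologram (m)

theta_max = np.arcsin(wavelength / (2 * pitch))   # maximum diffraction half-angle

def subhologram_width(z):
    """Width of the hologram region that a point at depth z influences."""
    return 2 * z * np.tan(theta_max)

# Target hologram: the farthest scene point sets the maximum subhologram width.
w_target = subhologram_width(z_far - z_near)
# Midpoint hologram: no point is farther than half the frustum depth.
w_midpoint = subhologram_width(0.5 * (z_far - z_near))

print(f"max subhologram width: {w_target*1e3:.2f} mm -> {w_midpoint*1e3:.2f} mm")
```

A smaller maximum subhologram translates into a smaller effective receptive field that the compact CNN has to cover.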

Extended Data Fig. 4 Evaluation of tensor holography CNN on model architecture and test patterns.

a, Performance comparison of different CNN architectures. b, Performance comparison of different CNN miniaturization methods. c, CNN prediction of two standard test pattern (USAF-1951 and RCA Indian-head) variants made by the authors.

Extended Data Fig. 5 Evaluation of tensor holography CNN on additional computer-rendered scenes.

a, b, CNN prediction of amplitude and phase along with focused reconstructions for holograms of a living room scene from the DeepFocus dataset51 (a) and a night landscape scene from the Stanford light field dataset29 (b). a, Certain still images from ‘ArchVizPRO Vol. 2’ were used to render new images for inclusion in this publication with the permission of the copyright holder (© Corridori Ruggero 2018), under a Creative Commons licence (https://creativecommons.org/licenses/by-nc/4.0/). Panel b reproduced with permission from ref. 29, ACM.

Extended Data Fig. 6 Evaluation of tensor holography CNN on real-world captured scenes.

a, b, CNN prediction of amplitude and phase along with focused reconstructions for holograms of a statue scene (a) and a mansion scene (b). Both scenes are from the ETH light field dataset46.

Extended Data Fig. 7 Comparison of the original DPM and the AA-DPM.

Reconstruction of two real-world scenes from the encoded phase-only holograms. The couch scene is focused on the mouse toy and the statue scene is focused on the black statue. Orange bounding boxes highlight regions with strong high-frequency artefacts. Left: DPM. Right: AA-DPM.
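For reference, the original DPM of refs. 49, 50 decomposes each complex-valued pixel into two phase-only values interleaved on a checkerboard, as in the sketch below (the normalisation and layout are illustrative assumptions). The AA-DPM removes the highest spatial frequencies from the complex field before this decomposition, which is what suppresses the artefacts highlighted in the orange boxes; the exact filtering is described in the Methods and is not reproduced here.

```python
import numpy as np

def double_phase_encode(amplitude, phase):
    """Double phase method: encode a complex field as a phase-only hologram
    by interleaving two phase components on a checkerboard."""
    a = amplitude / amplitude.max()            # normalise amplitude to [0, 1]
    offset = np.arccos(np.clip(a, 0.0, 1.0))   # amplitude folded into a phase offset
    phi_lo = phase - offset
    phi_hi = phase + offset

    ny, nx = amplitude.shape
    ys, xs = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    checker = (xs + ys) % 2 == 0               # checkerboard interleaving
    return np.where(checker, phi_lo, phi_hi)
```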

Extended Data Fig. 8 Holographic display prototype used for the experimental results shown in this paper.

The control box of the laser, Labjack DAQ and camera are not visualized in the figure.

Extended Data Fig. 9 Additional experimental demonstration of 3D holographic projection (part 1).

The RGB-D input can be found in Extended Data Fig. 6.

Extended Data Fig. 10 Additional experimental demonstration of 3D holographic projection (part 2).

The RGB-D inputs can be found in Extended Data Fig. 6 for a, and Extended Data Fig. 4 for b. Panel a reproduced with permission from ref. 29, ACM.

Supplementary information

Video 1

This video demonstrates a simulated focal sweep of a CNN-predicted hologram computed for a real-world captured 3D couch scene. The image resolution is 1080p.

Video 2

This video demonstrates a simulated focal sweep of a CNN-predicted hologram computed for a computer-rendered 3D living room scene. The image resolution is 1,024 × 1,024 pixels.

Video 3

This video demonstrates a photographed focal sweep of a CNN-predicted hologram computed for a real-world captured 3D couch scene. The video is captured by a Sony A7 Mark III mirrorless camera paired with a Sony GM 16–35 mm f/2.8 lens at 4K/30 Hz and downsampled to 1080p. Only the green channel is visualized for temporal stability.

Video 4

This video demonstrates real-time 3D hologram computation on an NVIDIA TITAN RTX GPU. The video is captured by a Panasonic GH5 mirrorless camera with a Lumix 10–25 mm f/1.7 lens at 4K/60 Hz (a colour frame rate of 20 Hz) and downsampled to 1080p. The colour is obtained field-sequentially.

Video 5

This video demonstrates interactive hologram computation on an iPhone 11 Pro using a mini version of the tensor holography CNN (see the Fig. 2 caption for network architecture details).

Video 6

This video demonstrates a simulated focal sweep of a CNN-predicted hologram computed for a 3D Star test pattern. The image resolution is 1,550 × 1,462 pixels.


About this article


Cite this article

Shi, L., Li, B., Kim, C. et al. Towards real-time photorealistic 3D holography with deep neural networks. Nature 591, 234–239 (2021). https://doi.org/10.1038/s41586-020-03152-0


