BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Abstract

We consider the problem of video snapshot compressive imaging (SCI), where multiple high-speed frames are coded by different masks and then summed into a single measurement. This measurement and the modulation masks are fed into our Recurrent Neural Network (RNN) to reconstruct the desired high-speed frames. Our end-to-end sampling and reconstruction system is dubbed BIdirectional Recurrent Neural networks with Adversarial Training (BIRNAT). To the best of our knowledge, this is the first time that recurrent networks have been applied to the SCI problem. By exploiting the underlying correlation of sequential video frames, BIRNAT outperforms other deep-learning-based algorithms and the state-of-the-art optimization-based algorithm, DeSCI. BIRNAT employs a deep convolutional neural network with Resblock and feature-map self-attention to reconstruct the first frame, based on which a bidirectional RNN reconstructs the following frames in a sequential manner. To improve the quality of the reconstructed video, BIRNAT is further trained with an adversarial loss in addition to the mean square error loss. Extensive results on both simulation and real data (from two SCI cameras) demonstrate the superior performance of our BIRNAT system. The code is available at https://github.com/BoChenGroup/BIRNAT.
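
The measurement process described in the abstract (each high-speed frame is coded by its own mask, and the coded frames are summed into one snapshot) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `sci_measurement`, the array shapes, the number of frames `B = 8`, and the use of random binary masks are assumptions made only for illustration.

```python
import numpy as np


def sci_measurement(frames, masks):
    """Simulate one SCI snapshot (illustrative helper, not from the paper).

    frames: array of shape (B, H, W), the B high-speed frames.
    masks:  array of shape (B, H, W), one modulation mask per frame.
    Returns the single 2-D measurement of shape (H, W).
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must have the same shape")
    # Code each frame with its mask (element-wise), then sum over time.
    return (frames * masks).sum(axis=0)


# Toy example: B, H, W and the binary masks are assumed values.
rng = np.random.default_rng(0)
B, H, W = 8, 4, 4
frames = rng.random((B, H, W))
masks = rng.integers(0, 2, size=(B, H, W))
measurement = sci_measurement(frames, masks)
print(measurement.shape)
```

A reconstruction network such as the one described above would take `measurement` and `masks` as input and estimate all `B` frames; that inverse step is not sketched here.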

Acknowledgement

B. Chen acknowledges the support of the Program for Oversea Talent by Chinese Central Government, the 111 Project (No. B18039), NSFC (61771361), and the Shaanxi Innovation Team Project.

Author information

Corresponding authors

Correspondence to Bo Chen or Xin Yuan.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (mp4 2938 KB)

Supplementary material 3 (mp4 2887 KB)

Supplementary material 4 (mp4 429 KB)

Supplementary material 5 (mp4 3387 KB)

Supplementary material 6 (mp4 518 KB)

Supplementary material 7 (mp4 6889 KB)

Supplementary material 1 (pdf 41027 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Cheng, Z. et al. (2020). BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12369. Springer, Cham. https://doi.org/10.1007/978-3-030-58586-0_16

  • DOI: https://doi.org/10.1007/978-3-030-58586-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58585-3

  • Online ISBN: 978-3-030-58586-0

  • eBook Packages: Computer Science, Computer Science (R0)