BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging

Cheng, Ziheng; Lu, Ruiying; Wang, Zhengjue; Zhang, Hao; Chen, Bo; Meng, Ziyi; Yuan, Xin

doi:10.1007/978-3-030-58586-0_16

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12369))

Included in the following conference series:

European Conference on Computer Vision

3942 Accesses
32 Citations

Abstract

We consider the problem of video snapshot compressive imaging (SCI), where multiple high-speed frames are coded by different masks and then summed to a single measurement. This measurement and the modulation masks are fed into our Recurrent Neural Network (RNN) to reconstruct the desired high-speed frames. Our end-to-end sampling and reconstruction system is dubbed BIdirectional Recurrent Neural networks with Adversarial Training (BIRNAT). To our best knowledge, this is the first time that recurrent networks are employed to SCI problem. Our proposed BIRNAT outperforms other deep learning based algorithms and the state-of-the-art optimization based algorithm, DeSCI, through exploiting the underlying correlation of sequential video frames. BIRNAT employs a deep convolutional neural network with Resblock and feature map self-attention to reconstruct the first frame, based on which bidirectional RNN is utilized to reconstruct the following frames in a sequential manner. To improve the quality of the reconstructed video, BIRNAT is further equipped with the adversarial training besides the mean square error loss. Extensive results on both simulation and real data (from two SCI cameras) demonstrate the superior performance of our BIRNAT system. The codes are available at https://github.com/BoChenGroup/BIRNAT.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Barbastathis, G., Ozcan, A., Situ, G.: On the use of deep learning for computational imaging. Optica 6(8), 921–943 (2019)
Article Google Scholar
Bioucas-Dias, J., Figueiredo, M.: A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image Process. 16(12), 2992–3004 (2007)
Article MathSciNet Google Scholar
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Article Google Scholar
Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2, pp. 60–65. IEEE (2005)
Google Scholar
Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theor. 52(4), 1289–1306 (2006)
Article MathSciNet Google Scholar
Emmanuel, C., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theor. 52(2), 489–509 (2006)
Article MathSciNet Google Scholar
Goodfellow, I.: Nips 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160 (2016)
Goodfellow, I.J., et al.: Generative adversarial nets. In: Proceedings of the 27th International Conference on Neural Information Processing Systems, NIPS 2014, vol. 2, pp. 2672–2680 (2014)
Google Scholar
Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (May 2013). https://doi.org/10.1109/ICASSP.2013.6638947
Gu, S., Zhang, L., Zuo, W., Feng, X.: Weighted nuclear norm minimization with application to image denoising. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2862–2869 (2014)
Google Scholar
Haris, M., Shakhnarovich, G., Ukita, N.: Recurrent back-projection network for video super-resolution. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019)
Google Scholar
He, K., Zhang, X., Ren, S., J, S.: Deep residual learning for image recognition. In: CVPR (2016)
Google Scholar
Hitomi, Y., Gu, J., Gupta, M., Mitsunaga, T., Nayar, S.K.: Video from a single coded exposure photograph using a learned over-complete dictionary. In: 2011 International Conference on Computer Vision, pp. 287–294. IEEE (2011)
Google Scholar
Huang, Y., Wang, W., Wang, L.: Video super-resolution via bidirectional recurrent convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 1015–1028 (2018). https://doi.org/10.1109/TPAMI.2017.2701380
Article Google Scholar
Iliadis, M., Spinoulas, L., Katsaggelos, A.K.: Deep fully-connected networks for video compressive sensing. Digit. Sig. Proc. 72, 9–18 (2018). https://doi.org/10.1016/j.dsp.2017.09.010
Article Google Scholar
Yang, J., et al.: Video compressive sensing using Gaussian mixture models. IEEE Trans. Image Process. 23(11), 4863–4878 (2014)
Article MathSciNet Google Scholar
Jaeger, H.: A tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the “echo state network” approach (2005)
Google Scholar
Jalali, S., Yuan, X.: Snapshot compressed sensing: performance bounds and algorithms. IEEE Trans. Inf. Theor. 65(12), 8005–8024 (2019). https://doi.org/10.1109/TIT.2019.2940666
Article MathSciNet MATH Google Scholar
Jalali, S., Yuan, X.: Compressive imaging via one-shot measurements. In: IEEE International Symposium on Information Theory (ISIT) (2018)
Google Scholar
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
Google Scholar
Kulkarni, K., Lohit, S., Turaga, P., Kerviche, R., Ashok, A.: ReconNet: non-iterative reconstruction of images from compressively sensed random measurements. In: CVPR (2016)
Google Scholar
Liu, Y., Yuan, X., Suo, J., Brady, D., Dai, Q.: Rank minimization for snapshot compressive imaging. IEEE Trans. Pattern Anal. Mach. Intell. 41(12), 2990–3006 (2019)
Article Google Scholar
Llull, P., et al.: Coded aperture compressive temporal imaging. Opt. Exp. 21(9), 10526–10545 (2013). https://doi.org/10.1364/OE.21.010526
Article Google Scholar
Llull, P., Yuan, X., Carin, L., Brady, D.J.: Image translation for single-shot focal tomography. Optica 2(9), 822–825 (2015)
Article Google Scholar
Lu, G., Ouyang, W., Xu, D., Zhang, X., Cai, C., Gao, Z.: DVC: an end-to-end deep video compression framework. In: CVPR (2019)
Google Scholar
Ma, J., Liu, X., Shou, Z., Yuan, X.: Deep tensor ADMM-Net for snapshot compressive imaging. In: IEEE/CVF Conference on Computer Vision (ICCV) (2019)
Google Scholar
Meng, Z., Ma, J., Yuan, X.: End-to-end low cost compressive spectral imaging with spatial-spectral self-attention. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 187–204. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_12
Chapter Google Scholar
Meng, Z., Qiao, M., Ma, J., Yu, Z., Xu, K., Yuan, X.: Snapshot multispectral endomicroscopy. Opt. Lett. 45(14), 3897–3900 (2020)
Article Google Scholar
Mescheder, L., Nowozin, S., Geiger, A.: Which training methods for GANs do actually converge? In: International Conference on Machine Learning (ICML) (2018)
Google Scholar
Miao, X., Yuan, X., Pu, Y., Athitsos, V.: \(\lambda \)-Net: reconstruct hyperspectral images from a snapshot measurement. In: IEEE/CVF Conference on Computer Vision (ICCV) (2019)
Google Scholar
Mikolov, T., Karafiát, M., Burget, L., Cernockỳ, J., Khudanpur, S.: Recurrent neural network based language model. In: INTERSPEECH, vol. 2, p. 3 (2010)
Google Scholar
Nah, S., Son, S., Lee, K.M.: Recurrent neural networks with intra-frame iterations for video deblurring. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019)
Google Scholar
Pont-Tuset, J., Perazzi, F., Caelles, S., Arbelaez, P., Sorkine-Hornung, A., Gool, L.V.: The 2017 DAVIS challenge on video object segmentation. CoRR abs/1704.00675 (2017). http://arxiv.org/abs/1704.00675
Qiao, M., Liu, X., Yuan, X.: Snapshot spatial-temporal compressive imaging. Opt. Lett. 45(7), 1659–1662 (2020)
Article Google Scholar
Qiao, M., Meng, Z., Ma, J., Yuan, X.: Deep learning for video compressive sensing. APL Photonics 5(3), 030801 (2020). https://doi.org/10.1063/1.5140721
Article Google Scholar
Reddy, D., Veeraraghavan, A., Chellappa, R.: P2c2: programmable pixel compressive camera for high speed imaging. In: CVPR 2011, pp. 329–336. IEEE (2011)
Google Scholar
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Chapter Google Scholar
Roux, J.R.L., Weninger, J.: Deep unfolding: Model-based inspiration of novel deep architectures (2014)
Google Scholar
Sun, Y., Yuan, X., Pang, S.: High-speed compressive range imaging based on active illumination. Opt. Exp. 24(20), 22836–22846 (2016)
Article Google Scholar
Sun, Y., Yuan, X., Pang, S.: Compressive high-speed stereo imaging. Opt. Exp. 25(15), 18182–18190 (2017). https://doi.org/10.1364/OE.25.018182
Article Google Scholar
Tsai, T.H., Llull, P., Yuan, X., Carin, L., Brady, D.J.: Spectral-temporal compressive imaging. Opt. Lett. 40(17), 4054–4057 (2015)
Article Google Scholar
Tsai, T.H., Yuan, X., Brady, D.J.: Spatial light modulator based color polarization imaging. Opt. Exp. 23(9), 11912–11926 (2015)
Article Google Scholar
Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 5998–6008. Curran Associates, Inc. (2017). http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Google Scholar
Ventura, C., Bellver, M., Girbau, A., Salvador, A., Marques, F., Giro-i Nieto, X.: RVOS: end-to-end recurrent network for video object segmentation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019)
Google Scholar
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., et al.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
Article Google Scholar
Xie, J., Xu, L., Chen, E.: Image denoising and inpainting with deep neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 25, pp. 341–349. Curran Associates, Inc. (2012). http://papers.nips.cc/paper/4686-image-denoising-and-inpainting-with-deep-neural-networks.pdf
Xu, K., Ren, F.: CSVideoNet: A real-time end-to-end learning framework for high-frame-rate video compressive sensing. arXiv: 1612.05203 (December 2016)
Yang, J., Liao, X., Yuan, X., Llull, P., Brady, D.J., Sapiro, G., Carin, L.: Compressive sensing by learning a Gaussian mixture model from measurements. IEEE Trans. Image Process. 24(1), 106–119 (2015)
Article MathSciNet Google Scholar
Yang, Y., Sun, J., Li, H., Xu, Z.: Deep ADMM-Net for compressive sensing MRI. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29, pp. 10–18. Curran Associates, Inc. (2016)
Google Scholar
Yoshida, M., et al.: Joint optimization for compressive video sensing and reconstruction under hardware constraints. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 649–663. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_39
Chapter Google Scholar
Yuan, X.: Generalized alternating projection based total variation minimization for compressive sensing. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 2539–2543 (September 2016)
Google Scholar
Yuan, X., Brady, D., Katsaggelos, A.K.: Snapshot compressive imaging: Theory, algorithms and applications. IEEE Sig. Process. Mag. (2020)
Google Scholar
Yuan, X., Liu, Y., Suo, J., Dai, Q.: Plug-and-play algorithms for large-scale snapshot compressive imaging. In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020)
Google Scholar
Yuan, X., et al.: Low-cost compressive sensing for color video and depth. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3318–3325 (2014). https://doi.org/10.1109/CVPR.2014.424
Yuan, X., Pang, S.: Structured illumination temporal compressive microscopy. Biomed. Opt. Exp. 7, 746–758 (2016)
Article Google Scholar
Yuan, X., Pu, Y.: Parallel lensless compressive imaging via deep convolutional neural networks. Opt. Exp. 26(2), 1962–1977 (2018)
Article Google Scholar
Yuan, X., Tsai, T.H., Zhu, R., Llull, P., Brady, D., Carin, L.: Compressive hyperspectral imaging with side information. IEEE J. Sel. Top. Sig. Process. 9(6), 964–976 (2015)
Article Google Scholar
Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017). https://doi.org/10.1109/TIP.2017.2662206
Article MathSciNet MATH Google Scholar

Download references

Acknowledgement

B. Chen acknowledges the support of the Program for Oversea Talent by Chinese Central Government, the 111 Project (No. B18039), and NSFC (61771361) and Shaanxi Innovation Team Project.

Author information

Authors and Affiliations

National Laboratory of Radar Signal Processing, Xidian University, Xi’an, China
Ziheng Cheng, Ruiying Lu, Zhengjue Wang, Hao Zhang & Bo Chen
Beijing University of Posts and Telecommunications, Beijing, China
Ziyi Meng
New Jersey Institute of Technology, Newark, NJ, USA
Ziyi Meng
Nokia Bell Labs, Murray Hill, NJ, USA
Xin Yuan

Authors

Ziheng Cheng
View author publications
You can also search for this author in PubMed Google Scholar
Ruiying Lu
View author publications
You can also search for this author in PubMed Google Scholar
Zhengjue Wang
View author publications
You can also search for this author in PubMed Google Scholar
Hao Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Bo Chen
View author publications
You can also search for this author in PubMed Google Scholar
Ziyi Meng
View author publications
You can also search for this author in PubMed Google Scholar
Xin Yuan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Bo Chen or Xin Yuan .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (mp4 2938 KB)

Supplementary material 3 (mp4 2887 KB)

Supplementary material 4 (mp4 429 KB)

Supplementary material 5 (mp4 3387 KB)

Supplementary material 6 (mp4 518 KB)

Supplementary material 7 (mp4 6889 KB)

Supplementary material 1 (pdf 41027 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Cheng, Z. et al. (2020). BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12369. Springer, Cham. https://doi.org/10.1007/978-3-030-58586-0_16

Download citation

DOI: https://doi.org/10.1007/978-3-030-58586-0_16
Published: 30 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58585-3
Online ISBN: 978-3-030-58586-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics