Unsupervised deep context prediction for background estimation and foreground segmentation

  • Original paper
Machine Vision and Applications

Abstract

Background estimation is a fundamental step in many high-level vision applications, such as tracking and surveillance. Existing background estimation techniques degrade in the presence of challenges such as dynamic backgrounds, photometric variations, camera jitter, and shadows. To handle these challenges and estimate the background accurately, we propose a unified method based on a Generative Adversarial Network (GAN) and image inpainting. The proposed method is built on a context prediction network, an unsupervised visual feature learning hybrid GAN model. Context prediction is followed by a semantic inpainting network for texture enhancement. We also propose a solution for arbitrary region inpainting that combines the center region inpainting method with the Poisson blending technique. The proposed algorithm is compared with existing state-of-the-art methods for background estimation and foreground segmentation and outperforms them by a significant margin.
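
The abstract names Poisson blending as the step that carries an inpainted region into an arbitrary location, but it does not describe the implementation. As a rough illustration of the general technique only, the following NumPy sketch solves the discrete Poisson equation with Jacobi iterations on a grayscale image. The function name `poisson_blend`, its parameters, and the choice of solver are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def poisson_blend(source, target, mask, iters=2000):
    """Blend `source` into `target` inside `mask` (2-D grayscale arrays).

    Hypothetical helper for illustration. Inside the mask, the 4-neighbour
    Laplacian of the result is driven toward the Laplacian of `source`;
    outside the mask, the result stays equal to `target`.
    Assumes the mask does not touch the image border.
    """
    src = source.astype(np.float64)
    out = target.astype(np.float64).copy()
    inside = mask.astype(bool)
    if (inside[0, :].any() or inside[-1, :].any()
            or inside[:, 0].any() or inside[:, -1].any()):
        raise ValueError("mask must not touch the image border")

    # Discrete Laplacian of the source: 4*g_p minus the sum of its neighbours.
    lap = np.zeros_like(src)
    lap[1:-1, 1:-1] = (4.0 * src[1:-1, 1:-1]
                       - src[:-2, 1:-1] - src[2:, 1:-1]
                       - src[1:-1, :-2] - src[1:-1, 2:])

    # Jacobi iteration: f_p = (sum of neighbours of f + lap_p) / 4 inside the mask.
    for _ in range(iters):
        neighbours = np.zeros_like(out)
        neighbours[1:-1, 1:-1] = (out[:-2, 1:-1] + out[2:, 1:-1]
                                  + out[1:-1, :-2] + out[1:-1, 2:])
        updated = (neighbours + lap) / 4.0
        out[inside] = updated[inside]
    return out


# Toy usage: paste a random patch into a flat image through a centered mask.
demo_target = np.full((8, 8), 10.0)
demo_source = np.random.default_rng(0).random((8, 8)) * 255.0
demo_mask = np.zeros((8, 8), dtype=bool)
demo_mask[2:6, 2:6] = True
demo_result = poisson_blend(demo_source, demo_target, demo_mask)
```

Jacobi iteration is used here only because it is short and easy to read; a sparse linear solver would normally be preferred for images of realistic size.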

Notes

  1. http://scenebackgroundmodeling.net/.

  2. http://jacarini.dinf.usherbrooke.ca/datasetOverview/.

Acknowledgements

This research was supported by the Development project of leading technology for future vehicle of the business of Daegu metropolitan city (No. 20171105).

Author information

Corresponding author

Correspondence to Soon Ki Jung.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sultana, M., Mahmood, A., Javed, S. et al. Unsupervised deep context prediction for background estimation and foreground segmentation. Machine Vision and Applications 30, 375–395 (2019). https://doi.org/10.1007/s00138-018-0993-0

  • DOI: https://doi.org/10.1007/s00138-018-0993-0
