Betrayed by Motion: Camouflaged Object Discovery via Motion Segmentation

Lamdouar, Hala; Yang, Charig; Xie, Weidi; Zisserman, Andrew

doi:10.1007/978-3-030-69532-3_30

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12623)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

The objective of this paper is to design a computational architecture that discovers camouflaged objects in videos, specifically by exploiting motion information to perform object segmentation. We make the following three contributions: (i) We propose a novel architecture that consists of two essential components for breaking camouflage: a differentiable registration module that aligns consecutive frames based on the background, which effectively emphasises the object boundary in the difference image, and a motion segmentation module with memory that discovers the moving objects while maintaining object permanence even when motion is absent at some point. (ii) We collect the first large-scale Moving Camouflaged Animals (MoCA) video dataset, which consists of over 140 clips across a diverse range of animals (67 categories). (iii) We demonstrate the effectiveness of the proposed model on MoCA, and achieve competitive performance on the unsupervised segmentation protocol on DAVIS2016 by relying only on motion.
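The role of background registration described in the abstract can be illustrated with a small sketch. This is not the paper's differentiable registration module: it assumes a purely translational camera motion and swaps in a classical FFT-based cross-correlation to estimate that shift. The helper names (make_frames, estimate_shift, registered_difference) and the synthetic frames are hypothetical, chosen only for illustration.

```python
import numpy as np


def make_frames(size=64, cam_shift=(3, 5), obj_shift=(6, 8), seed=0):
    """Build two synthetic frames (illustrative assumption, not MoCA data).

    The background is random texture that the camera shifts by cam_shift.
    The object is a 10x10 patch with the same texture statistics, so it is
    camouflaged in any single frame, and it moves by obj_shift in the image.
    """
    rng = np.random.default_rng(seed)
    background = rng.random((size, size))
    patch = rng.random((10, 10))
    frame0 = background.copy()
    frame0[20:30, 20:30] = patch
    frame1 = np.roll(background, cam_shift, axis=(0, 1))
    r, c = 20 + obj_shift[0], 20 + obj_shift[1]
    frame1[r:r + 10, c:c + 10] = patch
    return frame0, frame1


def estimate_shift(prev, curr):
    """Estimate the dominant circular shift (dy, dx) from prev to curr.

    Uses circular cross-correlation computed with the FFT. The background
    covers most pixels, so the correlation peak follows the background.
    """
    prev0 = prev - prev.mean()
    curr0 = curr - curr.mean()
    corr = np.fft.ifft2(np.fft.fft2(curr0) * np.conj(np.fft.fft2(prev0))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)


def registered_difference(prev, curr):
    """Align prev to curr using the estimated shift, then take |curr - aligned|."""
    dy, dx = estimate_shift(prev, curr)
    aligned = np.roll(prev, (dy, dx), axis=(0, 1))
    return np.abs(curr - aligned)


frame0, frame1 = make_frames()
naive = np.abs(frame1 - frame0)                      # no registration
registered = registered_difference(frame0, frame1)  # background cancelled
print("estimated background shift:", estimate_shift(frame0, frame1))
print("changed-pixel fraction, naive:", float((naive > 1e-9).mean()))
print("changed-pixel fraction, registered:", float((registered > 1e-9).mean()))
```

Without registration, nearly every pixel changes between the two frames, so the difference image says nothing about the object. After aligning on the background, the difference is non-zero only where the object was and where it now is, which is the cue a motion segmentation stage can then exploit.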

Acknowledgements

This research was supported by the UK EPSRC CDT in AIMS, Schlumberger Studentship, and the UK EPSRC Programme Grant Seebibyte EP/M013774/1.

Author information

Authors and Affiliations

Visual Geometry Group, University of Oxford, Oxford, UK
Hala Lamdouar, Charig Yang, Weidi Xie & Andrew Zisserman

Corresponding author

Correspondence to Hala Lamdouar.

Editor information

Editors and Affiliations

Waseda University, Tokyo, Japan
Hiroshi Ishikawa
Institute of Automation of Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
Czech Technical University in Prague, Prague, Czech Republic
Tomas Pajdla
University of Pennsylvania, Philadelphia, PA, USA
Jianbo Shi

Cite this paper

Lamdouar, H., Yang, C., Xie, W., Zisserman, A. (2021). Betrayed by Motion: Camouflaged Object Discovery via Motion Segmentation. In: Ishikawa, H., Liu, CL., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol 12623. Springer, Cham. https://doi.org/10.1007/978-3-030-69532-3_30

DOI: https://doi.org/10.1007/978-3-030-69532-3_30
Published: 27 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69531-6
Online ISBN: 978-3-030-69532-3
eBook Packages: Computer Science, Computer Science (R0)