Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12353)

Abstract

Caricature attributes provide distinctive facial features that support research in psychology and neuroscience. However, unlike facial photo attribute datasets, which contain large numbers of annotated images, annotations of caricature attributes are rare. To facilitate research on attribute learning for caricatures, we propose a caricature attribute dataset, WebCariA. Moreover, to make use of models trained on face attributes, we propose a novel unsupervised domain adaptation framework for cross-modality (i.e., photo-to-caricature) attribute recognition, with an integrated inter- and intra-domain consistency learning scheme. Specifically, the inter-domain consistency learning scheme consists of an image-to-image translator, which first bridges the domain gap between photos and caricatures by generating intermediate image samples, and a label consistency learning module, which aligns their semantic information. The intra-domain consistency learning scheme integrates the common feature consistency learning module with a novel attribute-aware attention-consistency learning module for more efficient alignment. An extensive ablation study shows the effectiveness of the proposed method, which also outperforms state-of-the-art methods by a margin. The implementation of the proposed method is available at https://github.com/KeleiHe/DAAN.
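The abstract does not give the loss definitions, so the following is only a minimal sketch of what label-consistency and attention-consistency terms could look like, not the authors' formulation. It assumes two views of the same face (for example, a photo and its translated counterpart), per-attribute logits of shape (N, K), and non-negative attention maps of shape (N, K, H, W). The function names, the squared-error form, and the weights `w_label` and `w_att` are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def label_consistency(logits_a, logits_b):
    """Mean squared gap between per-attribute probabilities of two views.

    Assumption: squared error on sigmoid outputs; the paper may use a
    different divergence.
    """
    return float(np.mean((sigmoid(logits_a) - sigmoid(logits_b)) ** 2))


def normalize_maps(att):
    """Flatten each (H, W) attention map and scale it to sum to 1."""
    flat = att.reshape(att.shape[0], att.shape[1], -1)
    return flat / (flat.sum(axis=-1, keepdims=True) + 1e-8)


def attention_consistency(att_a, att_b):
    """Mean squared gap between normalized per-attribute attention maps."""
    return float(np.mean((normalize_maps(att_a) - normalize_maps(att_b)) ** 2))


def combined_consistency(logits_a, logits_b, att_a, att_b,
                         w_label=1.0, w_att=1.0):
    """Weighted sum of the two terms (weights are hypothetical)."""
    return (w_label * label_consistency(logits_a, logits_b)
            + w_att * attention_consistency(att_a, att_b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy batch: 4 images, 5 attributes, 7x7 attention maps.
    logits_a = rng.normal(size=(4, 5))
    logits_b = logits_a + 0.1 * rng.normal(size=(4, 5))
    att_a = rng.random((4, 5, 7, 7))
    att_b = rng.random((4, 5, 7, 7))
    print(combined_consistency(logits_a, logits_b, att_a, att_b))
```

Both terms are zero when the two views agree exactly and grow as predictions or attention maps diverge, which is the behaviour a consistency objective is meant to penalize. The released code at the repository above is the authoritative reference for the actual losses.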

W. Ji and K. He—These authors contributed equally as co-first authors.

Notes

  1. https://cs.nju.edu.cn/huojing/WebCariA.htm.

Acknowledgement

This work is supported in part by the National Science Foundation of China under Grant No. 61806092 and in part by the Jiangsu Natural Science Foundation under Grant No. BK20180326.

Author information

Correspondence to Kelei He or Jing Huo.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ji, W., He, K., Huo, J., Gu, Z., Gao, Y. (2020). Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_2

  • DOI: https://doi.org/10.1007/978-3-030-58598-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58597-6

  • Online ISBN: 978-3-030-58598-3

  • eBook Packages: Computer Science, Computer Science (R0)
