
A survey on face data augmentation for the training of deep neural networks

  • Review
  • Published in Neural Computing and Applications

Abstract

The quality and size of the training set have a great impact on the results of deep learning-based face-related tasks. However, collecting and labeling adequate samples with high quality and balanced distributions remains laborious and expensive, so various data augmentation techniques have been widely used to enrich training datasets. In this paper, we review existing work on face data augmentation from the perspectives of transformation types and methods, covering state-of-the-art approaches. Among these, we emphasize deep learning-based methods, especially generative adversarial networks, which have proven to be powerful and effective tools in recent years. We present their principles, discuss their results, and show their applications as well as limitations. We also introduce the metrics commonly used to evaluate these approaches. Finally, we point out the challenges and opportunities in the field of face data augmentation and provide brief yet insightful discussions.
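As a minimal illustration of the classical, transformation-based augmentation that the survey contrasts with GAN-based synthesis, the sketch below applies a few simple geometric and photometric perturbations to a face image. It is not code from the paper; the function name `augment_face`, the parameter ranges, and the use of Pillow are our own assumptions for demonstration only.

```python
# Illustrative sketch only: label-preserving geometric and photometric
# transforms of the kind surveyed, implemented with Pillow and the stdlib.
import random
from PIL import Image, ImageEnhance

def augment_face(img: Image.Image) -> Image.Image:
    """Apply a random combination of simple face-image augmentations."""
    # Horizontal flip: faces are roughly symmetric, so identity labels are preserved.
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    # Small in-plane rotation to simulate slight head tilt.
    img = img.rotate(random.uniform(-10, 10), resample=Image.BILINEAR)
    # Photometric jitter: brightness and contrast changes mimic lighting variation.
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.8, 1.2))
    img = ImageEnhance.Contrast(img).enhance(random.uniform(0.8, 1.2))
    # Random crop and resize back, simulating framing and scale variation.
    w, h = img.size
    dx, dy = int(0.05 * w), int(0.05 * h)
    left, top = random.randint(0, dx), random.randint(0, dy)
    img = img.crop((left, top, w - dx + left, h - dy + top)).resize((w, h))
    return img

# Usage sketch: expand a small training set by sampling several variants per image.
# augmented = [augment_face(Image.open(p)) for p in face_paths for _ in range(4)]
```

In practice such transforms are usually sampled on the fly during training rather than precomputed, which is part of what motivates the learned, generative approaches reviewed in this survey.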




Author information

Corresponding author

Correspondence to Xiang Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A summary of recent works on face data augmentation is given in Table 3. It summarizes these works in terms of transformation type, method, and the evaluations performed to test each algorithm's capability for data augmentation. Note that we only label the transformation types explicitly mentioned in the original papers; some methods may be capable of other transformations that their authors did not report.

Table 3 Summary of recent works


About this article


Cite this article

Wang, X., Wang, K. & Lian, S. A survey on face data augmentation for the training of deep neural networks. Neural Comput & Applic 32, 15503–15531 (2020). https://doi.org/10.1007/s00521-020-04748-3


