
Generative Sketch Healing

International Journal of Computer Vision

Abstract

To perceive and create a whole from parts is a prime trait of the human visual system. In this paper, we teach machines to perform a similar task by recreating a vectorised human sketch from its incomplete parts, a task we dub sketch healing. This is fundamentally different from prior work on image completion since (i) sketches exhibit a severe lack of visual cues and are sequential in nature, and, more importantly, (ii) we ask for an agent that does not just fill in a missing part, but recreates from scratch a novel sketch that closely resembles the partial input. We identify two key facets of sketch healing that are fundamental for effective learning. The first is encoding the incomplete sketch in a graph model that leverages the sequential nature of sketches to associate key visual parts centred around stroke junctions. The intuition is that message passing within the graph topology will naturally provide the healing power when it comes to missing parts (nodes and edges). The second is that healing is a trade-off between global semantic preservation and local structure reconstruction, and can only be solved effectively when both are taken into account and optimised together. Both qualitative and quantitative results suggest that the proposed method significantly outperforms state-of-the-art alternatives on sketch healing. Last but not least, we show that sketch healing can be re-purposed to support an interesting application, a sketch-based creativity assistant, which generates a novel sketch from two partial sketches even without being specifically trained to do so.
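
To make the graph-based encoding concrete, below is a minimal, illustrative message-passing step over a sketch graph, assuming nodes carry CNN embeddings of patches cropped around stroke junctions and edges link adjacent junctions. All names (GraphMessagePassing, node_feats, adj) are hypothetical; this is a sketch of the general technique, not the authors' implementation.

```python
# Illustrative only: one message-passing step over a sketch graph whose
# nodes are assumed to hold CNN embeddings of image patches cropped
# around stroke junctions, with edges linking adjacent junctions.
# Names and shapes are hypothetical, not the authors' implementation.
import torch
import torch.nn as nn


class GraphMessagePassing(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (N, dim) patch embeddings; adj: (N, N) 0/1 adjacency.
        # Add self-loops and symmetrically normalise (Kipf & Welling, 2017).
        adj = adj + torch.eye(adj.size(0))
        deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
        norm_adj = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]
        # Aggregating neighbour features lets nodes bordering a missing
        # region propagate evidence into the gap, which is the intuition
        # behind healing via message passing.
        return torch.relu(self.linear(norm_adj @ node_feats))


# Example: 5 junction nodes with 128-d embeddings on a chain graph.
feats = torch.randn(5, 128)
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
print(GraphMessagePassing(128)(feats, adj).shape)  # torch.Size([5, 128])
```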


Notes

  1. An earlier and preliminary version of this work was published in Su et al. (2020). Compared with Su et al. (2020), apart from more extensive experiments and analysis, this work differs in proposing a new healing-specific distance metric from a global perceptual view, which brings significant improvements in model performance (an illustrative sketch of such a perceptual distance follows this list). For ease of reading, we denote our framework as SketchHealer throughout, and only distinguish between the two works with a specific version number (1.0 vs. 2.0) when a comparison is under way (mostly in the experiment section).

  2. We follow Chen et al. (2017) in removing the KL-divergence term between \(q_{\phi }(z|{\hat{S}})\) and p(z) commonly found in the formulation of a VAE-like model (Kingma & Welling, 2013), which is shown to benefit multi-class generation (see the loss sketch after this list).

  3. airplane, alarm_clock, angel, apple, belt, bus, butterfly, cake, cat, clock, eye, fish, pig, sheep, spider, umbrella, The Great Wall.

  4. For all competitors, we adopt the optimal parameter settings suggested in their original papers.
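
To give a rough sense of what a "global perceptual view" distance can look like (see note 1), the following compares two rendered sketch images in the feature space of a pretrained CNN, in the spirit of Zhang et al. (2018). The VGG backbone, layer indices, and equal weighting are assumptions made for illustration, not the paper's healing-specific metric.

```python
# Rough illustration of a feature-space ("perceptual") distance between
# two rendered sketch images, in the spirit of Zhang et al. (2018).
# The VGG backbone, layer indices, and equal weighting are assumptions
# made for this sketch; they are NOT the paper's healing-specific metric.
import torch
from torchvision import models


def perceptual_distance(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """img_a, img_b: (1, 3, 224, 224) rendered sketches, ImageNet-normalised."""
    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
    dist = torch.tensor(0.0)
    feat_a, feat_b = img_a, img_b
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            feat_a, feat_b = layer(feat_a), layer(feat_b)
            if i in {3, 8, 15}:  # relu1_2, relu2_2, relu3_3 (assumed choice)
                # Unit-normalise channels so no single layer's activation
                # scale dominates the summed distance.
                na = feat_a / (feat_a.norm(dim=1, keepdim=True) + 1e-8)
                nb = feat_b / (feat_b.norm(dim=1, keepdim=True) + 1e-8)
                dist = dist + (na - nb).pow(2).mean()
    return dist
```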

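As a minimal illustration of the objective change in note 2, the following contrasts a standard VAE loss (Kingma & Welling, 2013) with the KL-free variant of Chen et al. (2017); `recon_loss`, `mu`, and `logvar` are hypothetical placeholders for the reconstruction loss and encoder outputs.

```python
# Minimal sketch of the objective change described in note 2: a VAE-style
# loss (Kingma & Welling, 2013) with the KL term optionally dropped, as
# in Chen et al. (2017). `recon_loss`, `mu`, and `logvar` are hypothetical
# placeholders for the reconstruction loss and encoder outputs.
import torch


def vae_loss(recon_loss: torch.Tensor, mu: torch.Tensor,
             logvar: torch.Tensor, drop_kl: bool = True) -> torch.Tensor:
    if drop_kl:
        # KL-free variant: train on reconstruction alone, reported to
        # benefit multi-class generation.
        return recon_loss
    # Standard closed-form KL between q(z|x) = N(mu, diag(exp(logvar)))
    # and the prior p(z) = N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```
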
References

  • Amir, D., & Weiss, Y. (2021). Understanding and simplifying perceptual distances. In CVPR.

  • Barnes, C., Shechtman, E., Finkelstein, A., & Goldman, D. B. (2009). Patchmatch: A randomized correspondence algorithm for structural image editing. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Berger, I., Shamir, A., Mahler, M., Carter, E., & Hodgins, J. (2013). Style and abstraction in portrait sketching. ACM Transactions on Graphics (TOG), 34, 1–12.

  • Bhunia, A. K., Das, A., Muhammad, U. R., Yang, Y., Hospedales, T. M., Xiang, T., Gryaditskaya, Y., & Song, Y. Z. (2020a). Pixelor: A competitive sketching AI agent. So you think you can sketch? In ACM Transactions on Graphics (Proc. SIGGRAPH Asia).

  • Bhunia, A. K., Yang, Y., Hospedales, T. M., Xiang, T., & Song, Y. Z. (2020b). Sketch less for more: On-the-fly fine-grained sketch-based image retrieval. In CVPR.

  • Blau, Y., & Michaeli, T. (2019). Rethinking lossy compression: The rate-distortion-perception tradeoff. In ICML.

  • Cao, N., Yan, X., Shi, Y., & Chen, C. (2019). Ai-sketcher: A deep generative model for producing high-quality sketches. In AAAI.

  • Cao, Y., Wang, H., Wang, C., Li, Z., Zhang, L., & Zhang, L. (2010). Mindfinder: Interactive sketch-based image search on millions of images. In ACM MM.

  • Chan, C., Ginosar, S., Zhou, T., & Efros, A. A. (2019). Everybody dance now. In ICCV.

  • Chen, Q., & Koltun, V. (2017). Photographic image synthesis with cascaded refinement networks. In ICCV.

  • Chen, Y., Tu, S., Yi, Y., & Xu, L. (2017). Sketch-pix2seq: A model to generate sketches of multiple categories. arXiv:1709.04121

  • Chen, Z. M., Wei, X. S., Wang, P., & Guo, Y. (2019). Multi-label image recognition with graph convolutional networks. In CVPR.

  • Das, A., Yang, Y., Hospedales, T., Xiang, T., & Song, Y. Z. (2020). Béziersketch: A generative model for scalable vector sketches. In ECCV.

  • Efros, A. A., & Leung, T. K. (1999). Texture synthesis by non-parametric sampling. In ICCV.

  • Eitz, M., Hays, J., & Alexa, M. (2012). How do humans sketch objects? In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Ge, S., Goswami, V., Zitnick, C. L., & Parikh, D. (2020). Creative sketch generation. In ICLR.

  • Gori, M., Monfardini, G., & Scarselli, F. (2005). A new model for learning in graph domains. In IJCNN.

  • Graves, A. (2013). Generating sequences with recurrent neural networks. arXiv:1308.0850

  • Ha, D., & Eck, D. (2018). A neural representation of sketch drawings. In ICLR.

  • Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2017). Squeezenet: Alexnet-level accuracy with 50× fewer parameters and <0.5 MB model size. In ICLR.

  • Iizuka, S., Simo-Serra, E., & Ishikawa, H. (2017). Globally and locally consistent image completion. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In CVPR.

  • Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In ECCV.

  • Johnson, J., Gupta, A., & Fei-Fei, L. (2018). Image generation from scene graphs. In CVPR.

  • Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2018). Progressive growing of gans for improved quality, stability, and variation. In ICLR.

  • Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In CVPR.

  • Kim, T., Cha, M., Kim, H., Lee, J. K., & Kim, J. (2017). Learning to discover cross-domain relations with generative adversarial networks. In ICML.

  • Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv:1312.6114

  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In ICLR.

  • Lahiri, A., Jain, A. K., Agrawal, S., Mitra, P., & Biswas, P. K. (2020). Prior guided gan based semantic inpainting. In CVPR.

  • Li, K., Pang, K., Song, J., Song, Y. Z., Xiang, T., Hospedales, T. M., & Zhang, H. (2018). Universal sketch perceptual grouping. In ECCV.

  • Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., & Yang, M. H. (2017). Universal style transfer via feature transforms. In NIPS.

  • Liu, F., Deng, X., Lai, Y. K., Liu, Y. J., Ma, C., & Wang, H. (2019). Sketchgan: Joint sketch completion and recognition with generative adversarial network. In CVPR.

  • Liu, G., Reda, F. A., Shih, K. J., Wang, T. C., Tao, A., & Catanzaro, B. (2018). Image inpainting for irregular holes using partial convolutions. In ECCV.

  • Miyato, T., Kataoka, T., Koyama, M., & Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. In ICLR.

  • Pang, K., Li, D., Song, J., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Deep factorised inverse-sketching. In ECCV.

  • Pang, K., Li, K., Yang, Y., Zhang, H., Hospedales, T. M., Xiang, T., & Song, Y. Z. (2019). Generalising fine-grained sketch-based image retrieval. In CVPR.

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch.

  • Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context encoders: Feature learning by inpainting. In CVPR.

  • Pumarola, A., Agudo, A., Martinez, A. M., Sanfeliu, A., & Moreno-Noguer, F. (2018). Ganimation: Anatomically-aware facial animation from a single image. In ECCV.

  • Qi, Y., Su, G., Chowdhury, P. N., Li, M., & Song, Y. Z. (2021). Sketchlattice: Latticed representation for sketch manipulation. In ICCV.

  • Ranjan, A., Bolkart, T., Sanyal, S., & Black, M. J. (2018). Generating 3d faces using convolutional mesh autoencoders. In ECCV.

  • Riaz Muhammad, U., Yang, Y., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Learning deep sketch abstraction. In CVPR.

  • Sagong, M., Shin, Y., Kim, S., Park, S., & Ko, S. (2019). Pepsi: Fast image inpainting with parallel decoding network. In CVPR.

  • Sangkloy, P., Burnell, N., Ham, C., & Hays, J. (2016). The sketchy database: Learning to retrieve badly drawn bunnies. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Sangkloy, P., Lu, J., Fang, C., Yu, F., & Hays, J. (2017). Scribbler: Controlling deep image synthesis with sketch and color. In CVPR.

  • Shen, Y., Liu, L., Shen, F., & Shao, L. (2018). Zero-shot sketch-image hashing. In CVPR.

  • Shih, M. L., Su, S. Y., Kopf, J., & Huang, J. B. (2020). 3d photography using context-aware layered depth inpainting. In CVPR.

  • Simo-Serra, E., Iizuka, S., & Ishikawa, H. (2018). Mastering sketching: Adversarial augmentation for structured prediction. ACM Transactions on Graphics (TOG), 37, 1–3.

  • Song, J., Pang, K., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018a). Learning to sketch with shortcut cycle consistency. In CVPR.

  • Song, Y., Yang, C., Lin, Z., Liu, X., Huang, Q., Li, H., & Jay Kuo, C. C. (2018b). Contextual-based image inpainting: Infer, match, and translate. In ECCV.

  • Song, Y., Yang, C., Shen, Y., Wang, P., Huang, Q., & Kuo, C. C. J. (2018c). Spg-net: Segmentation prediction and guidance network for image inpainting. In BMVC.

  • Su, G., Qi, Y., Pang, K., Yang, J., & Song, Y. Z. (2020). SketchHealer: A graph-to-sequence network for recreating partial human sketches. In BMVC.

  • Tallon, C., Bertrand, O., Bouchet, P., & Pernier, J. (1995). Gamma-range activity evoked by coherent visual stimuli in humans. European Journal of Neuroscience, 7, 1285–1291.

  • Tariq, T., Tursun, O. T., Kim, M., & Didyk, P. (2020). Why are deep representations good perceptual quality features? In ECCV.

  • Theis, L., Oord, A., & Bethge, M. (2016). A note on the evaluation of generative models. In ICLR.

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In NeurIPS.

  • Wang, A., Ren, M., & Zemel, R. (2021). Sketchembednet: Learning novel concepts by imitating drawings. In ICML.

  • Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., & Jiang, Y. G. (2018a). Pixel2mesh: Generating 3d mesh models from single RGB images. In ECCV.

  • Wang, T. C., Liu, M. Y., Zhu, J. Y., Tao, A., Kautz, J., & Catanzaro, B. (2018b). High-resolution image synthesis and semantic manipulation with conditional gans. In CVPR.

  • Wilczkowiak, M., Brostow, G. J., Tordoff, B., & Cipolla, R. (2005). Hole filling through photomontage. In BMVC.

  • Xiong, W., Yu, J., Lin, Z., Yang, J., Lu, X., Barnes, C., & Luo, J. (2019). Foreground-aware image inpainting. In CVPR.

  • Xu, P., Joshi, C. K., & Bresson, X. (2019). Multi-graph transformer for free-hand sketch recognition. TNNLS.

  • Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., & Li, H. (2017). High-resolution image inpainting using multi-scale neural patch synthesis. In CVPR.

  • Yang, J., Lu, J., Lee, S., Batra, D., & Parikh, D. (2018). Graph R-CNN for scene graph generation. In ECCV.

  • Yang, L., Pang, K., Zhang, H., & Song, Y. Z. (2021a). Sketchaa: Abstract representation for abstract sketches. In ICCV.

  • Yang, L., Zhuang, J., Fu, H., Wei, X., Zhou, K., & Zheng, Y. (2021b). Sketchgnn: Semantic sketch segmentation with graph neural networks. ACM Transactions on Graphics (TOG), 40, 1–13.

  • Yao, T., Pan, Y., Li, Y., & Mei, T. (2018). Exploring visual relationship for image captioning. In ECCV.

  • Yi, Z., Zhang, H., Tan, P., & Gong, M. (2017). Dualgan: Unsupervised dual learning for image-to-image translation. In ICCV.

  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2018). Generative image inpainting with contextual attention. In CVPR.

  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2019). Free-form image inpainting with gated convolution. In ICCV.

  • Yu, Q., Yang, Y., Liu, F., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2017). Sketch-a-net: A deep neural network that beats humans. IJCV.

  • Zang, S., Tu, S., & Xu, L. (2021). Controllable stroke-based sketch synthesis from a self-organized latent space. Neural Networks, 137, 138–150.

  • Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., & Metaxas, D. N. (2017a). Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks. In ICCV.

  • Zhang, R., Zhu, J. Y., Isola, P., Geng, X., Lin, A. S., Yu, T., & Efros, A. A. (2017b). Real-time user-guided image colorization with learned deep priors. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The unreasonable effectiveness of deep features as a perceptual metric. In CVPR.

  • Zheng, C., Cham, T. J., & Cai, J. (2019). Pluralistic image completion. In CVPR.

  • Zhou, H., Liu, Y., Liu, Z., Luo, P., & Wang, X. (2019). Talking face generation by adversarially disentangled audio-visual representation. In AAAI.

  • Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017a). Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV.

  • Zhu, J. Y., Zhang, R., Pathak, D., Darrell, T., Efros, A. A., Wang, O., & Shechtman, E. (2017b). Toward multimodal image-to-image translation. In NIPS.

Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 61601042).

Author information

Correspondence to Yonggang Qi.

Additional information

Communicated by William Smith.

About this article

Cite this article

Qi, Y., Su, G., Wang, Q. et al. Generative Sketch Healing. Int J Comput Vis 130, 2006–2021 (2022). https://doi.org/10.1007/s11263-022-01623-7
