
Generative Sketch Healing

International Journal of Computer Vision

Abstract

To perceive and create a whole from parts is a prime trait of the human visual system. In this paper, we teach machines to perform a similar task by recreating a vectorised human sketch from its incomplete parts, a task we dub sketch healing. This is fundamentally different from prior work on image completion since (i) sketches exhibit a severe lack of visual cues and are sequential in nature, and, more importantly, (ii) we ask for an agent that does not just fill in a missing part, but recreates from scratch a novel sketch that closely resembles the partial input. We identify two key facets of sketch healing that are fundamental for effective learning. The first is encoding the incomplete sketch in a graph model that leverages the sequential nature of sketches to associate key visual parts centred around stroke junctions. The intuition is that message passing within the graph topology will naturally provide the healing power when it comes to missing parts (nodes and edges). The second is that healing is a trade-off between global semantic preservation and local structure reconstruction, and can only be solved effectively when both are taken into account and optimised together. Both qualitative and quantitative results suggest that the proposed method significantly outperforms state-of-the-art alternatives on sketch healing. Last but not least, we show that sketch healing can be re-purposed to support an interesting application, a sketch-based creativity assistant, which generates a novel sketch from two partial sketches even without being specifically trained to do so.
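
To make the graph-based encoding concrete, below is a minimal, illustrative message-passing step over a sketch graph, assuming nodes carry CNN embeddings of patches cropped around stroke junctions and edges link adjacent junctions. All names (GraphMessagePassing, node_feats, adj) are hypothetical; this is a sketch of the general technique, not the authors' implementation.

```python
# Illustrative only: one message-passing step over a sketch graph whose
# nodes are assumed to hold CNN embeddings of image patches cropped
# around stroke junctions, with edges linking adjacent junctions.
# Names and shapes are hypothetical, not the authors' implementation.
import torch
import torch.nn as nn


class GraphMessagePassing(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (N, dim) patch embeddings; adj: (N, N) 0/1 adjacency.
        # Add self-loops and symmetrically normalise (Kipf & Welling, 2017).
        adj = adj + torch.eye(adj.size(0))
        deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
        norm_adj = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]
        # Aggregating neighbour features lets nodes bordering a missing
        # region propagate evidence into the gap, which is the intuition
        # behind healing via message passing.
        return torch.relu(self.linear(norm_adj @ node_feats))


# Example: 5 junction nodes with 128-d embeddings on a chain graph.
feats = torch.randn(5, 128)
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
print(GraphMessagePassing(128)(feats, adj).shape)  # torch.Size([5, 128])
```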


Notes

  1. An earlier and preliminary version of this work was published in Su et al. (2020). Compared with Su et al. (2020), apart from more extensive experiments and analysis, this work differs in proposing a new healing-specific distance metric from a global perceptual view, which brings significant improvements in model performance (an illustrative sketch of such a perceptual distance follows this list). For ease of reading, we denote our framework as SketchHealer throughout, and only distinguish between the two works with a specific version number (1.0 vs. 2.0) when a comparison is under way (mostly in the experiment section).

  2. We follow Chen et al. (2017) in removing the KL-divergence term between \(q_{\phi }(z|{\hat{S}})\) and p(z) commonly found in the formulation of a VAE-like model (Kingma & Welling, 2013), which is shown to benefit multi-class generation (see the loss sketch after this list).

  3. airplane, alarm_clock, angel, apple, belt, bus, butterfly, cake, cat, clock, eye, fish, pig, sheep, spider, umbrella, The Great Wall.

  4. For all competitors, we adopt the optimal parameter settings suggested in their original papers.
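
To give a rough sense of what a "global perceptual view" distance can look like (see note 1), the following compares two rendered sketch images in the feature space of a pretrained CNN, in the spirit of Zhang et al. (2018). The VGG backbone, layer indices, and equal weighting are assumptions made for illustration, not the paper's healing-specific metric.

```python
# Rough illustration of a feature-space ("perceptual") distance between
# two rendered sketch images, in the spirit of Zhang et al. (2018).
# The VGG backbone, layer indices, and equal weighting are assumptions
# made for this sketch; they are NOT the paper's healing-specific metric.
import torch
from torchvision import models


def perceptual_distance(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """img_a, img_b: (1, 3, 224, 224) rendered sketches, ImageNet-normalised."""
    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
    dist = torch.tensor(0.0)
    feat_a, feat_b = img_a, img_b
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            feat_a, feat_b = layer(feat_a), layer(feat_b)
            if i in {3, 8, 15}:  # relu1_2, relu2_2, relu3_3 (assumed choice)
                # Unit-normalise channels so no single layer's activation
                # scale dominates the summed distance.
                na = feat_a / (feat_a.norm(dim=1, keepdim=True) + 1e-8)
                nb = feat_b / (feat_b.norm(dim=1, keepdim=True) + 1e-8)
                dist = dist + (na - nb).pow(2).mean()
    return dist
```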

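As a minimal illustration of the objective change in note 2, the following contrasts a standard VAE loss (Kingma & Welling, 2013) with the KL-free variant of Chen et al. (2017); `recon_loss`, `mu`, and `logvar` are hypothetical placeholders for the reconstruction loss and encoder outputs.

```python
# Minimal sketch of the objective change described in note 2: a VAE-style
# loss (Kingma & Welling, 2013) with the KL term optionally dropped, as
# in Chen et al. (2017). `recon_loss`, `mu`, and `logvar` are hypothetical
# placeholders for the reconstruction loss and encoder outputs.
import torch


def vae_loss(recon_loss: torch.Tensor, mu: torch.Tensor,
             logvar: torch.Tensor, drop_kl: bool = True) -> torch.Tensor:
    if drop_kl:
        # KL-free variant: train on reconstruction alone, reported to
        # benefit multi-class generation.
        return recon_loss
    # Standard closed-form KL between q(z|x) = N(mu, diag(exp(logvar)))
    # and the prior p(z) = N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```
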
References

  • Amir, D., & Weiss, Y. (2021). Understanding and simplifying perceptual distances. In CVPR.

  • Barnes, C., Shechtman, E., Finkelstein, A., & Goldman, D. B. (2009). Patchmatch: A randomized correspondence algorithm for structural image editing. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Berger, I., Shamir, A., Mahler, M., Carter, E., & Hodgins, J. (2013). Style and abstraction in portrait sketching. ACM Transactions on Graphics (TOG), 34, 1–12.

  • Bhunia, A. K., Das, A., Muhammad, U. R., Yang, Y., Hospedales, T. M., Xiang, T., Gryaditskaya, Y., & Song, Y. Z. (2020a). Pixelor: A competitive sketching AI agent. So you think you can sketch? In ACM Transactions on Graphics (Proc. SIGGRAPH Asia).

  • Bhunia, A. K., Yang, Y., Hospedales, T. M., Xiang, T., & Song, Y. Z. (2020b). Sketch less for more: On-the-fly fine-grained sketch-based image retrieval. In CVPR.

  • Blau, Y., & Michaeli, T. (2019). Rethinking lossy compression: The rate-distortion-perception tradeoff. In ICML.

  • Cao, N., Yan, X., Shi, Y., & Chen, C. (2019). Ai-sketcher: A deep generative model for producing high-quality sketches. In AAAI.

  • Cao, Y., Wang, H., Wang, C., Li, Z., Zhang, L., & Zhang, L. (2010). Mindfinder: Interactive sketch-based image search on millions of images. In ACM MM.

  • Chan, C., Ginosar, S., Zhou, T., & Efros, A. A. (2019). Everybody dance now. In ICCV.

  • Chen, Q., & Koltun, V. (2017). Photographic image synthesis with cascaded refinement networks. In ICCV.

  • Chen, Y., Tu, S., Yi, Y., & Xu, L. (2017). Sketch-pix2seq: A model to generate sketches of multiple categories. arXiv:1709.04121

  • Chen, Z. M., Wei, X. S., Wang, P., & Guo, Y. (2019). Multi-label image recognition with graph convolutional networks. In CVPR.

  • Das, A., Yang, Y., Hospedales, T., Xiang, T., & Song, Y. Z. (2020). Béziersketch: A generative model for scalable vector sketches. In ECCV.

  • Efros, A. A., & Leung, T. K. (1999). Texture synthesis by non-parametric sampling. In ICCV.

  • Eitz, M., Hays, J., & Alexa, M. (2012). How do humans sketch objects? In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Ge, S., Goswami, V., Zitnick, C. L., & Parikh, D. (2020). Creative sketch generation. In ICLR.

  • Gori, M., Monfardini, G., & Scarselli, F. (2005). A new model for learning in graph domains. In IJCNN.

  • Graves, A. (2013). Generating sequences with recurrent neural networks. arXiv:1308.0850

  • Ha, D., & Eck, D. (2018). A neural representation of sketch drawings. In ICLR.

  • Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2017). Squeezenet: Alexnet-level accuracy with 50× fewer parameters and <0.5 MB model size. In ICLR.

  • Iizuka, S., Simo-Serra, E., & Ishikawa, H. (2017). Globally and locally consistent image completion. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In CVPR.

  • Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In ECCV.

  • Johnson, J., Gupta, A., & Fei-Fei, L. (2018). Image generation from scene graphs. In CVPR.

  • Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2018). Progressive growing of gans for improved quality, stability, and variation. In ICLR.

  • Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In CVPR.

  • Kim, T., Cha, M., Kim, H., Lee, J. K., & Kim, J. (2017). Learning to discover cross-domain relations with generative adversarial networks. In ICML.

  • Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv:1312.6114

  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In ICLR.

  • Lahiri, A., Jain, A. K., Agrawal, S., Mitra, P., & Biswas, P. K. (2020). Prior guided gan based semantic inpainting. In CVPR.

  • Li, K., Pang, K., Song, J., Song, Y. Z., Xiang, T., Hospedales, T. M., & Zhang, H. (2018). Universal sketch perceptual grouping. In ECCV.

  • Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., & Yang, M. H. (2017). Universal style transfer via feature transforms. In NIPS.

  • Liu, F., Deng, X., Lai, Y. K., Liu, Y. J., Ma, C., & Wang, H. (2019). Sketchgan: Joint sketch completion and recognition with generative adversarial network. In CVPR.

  • Liu, G., Reda, F. A., Shih, K. J., Wang, T. C., Tao, A., & Catanzaro, B. (2018). Image inpainting for irregular holes using partial convolutions. In ECCV.

  • Miyato, T., Kataoka, T., Koyama, M., & Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. In ICLR.

  • Pang, K., Li, D., Song, J., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Deep factorised inverse-sketching. In ECCV.

  • Pang, K., Li, K., Yang, Y., Zhang, H., Hospedales, T. M., Xiang, T., & Song, Y. Z. (2019). Generalising fine-grained sketch-based image retrieval. In CVPR.

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch.

  • Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context encoders: Feature learning by inpainting. In CVPR.

  • Pumarola, A., Agudo, A., Martinez, A. M., Sanfeliu, A., & Moreno-Noguer, F. (2018). Ganimation: Anatomically-aware facial animation from a single image. In ECCV.

  • Qi, Y., Su, G., Chowdhury, P. N., Li, M., & Song, Y. Z. (2021). Sketchlattice: Latticed representation for sketch manipulation. In ICCV.

  • Ranjan, A., Bolkart, T., Sanyal, S., & Black, M. J. (2018). Generating 3d faces using convolutional mesh autoencoders. In ECCV.

  • Riaz Muhammad, U., Yang, Y., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Learning deep sketch abstraction. In CVPR.

  • Sagong, M., Shin, Y., Kim, S., Park, S., & Ko, S. (2019). Pepsi: Fast image inpainting with parallel decoding network. In CVPR.

  • Sangkloy, P., Burnell, N., Ham, C., & Hays, J. (2016). The sketchy database: Learning to retrieve badly drawn bunnies. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Sangkloy, P., Lu, J., Fang, C., Yu, F., & Hays, J. (2017). Scribbler: Controlling deep image synthesis with sketch and color. In CVPR.

  • Shen, Y., Liu, L., Shen, F., & Shao, L. (2018). Zero-shot sketch-image hashing. In CVPR.

  • Shih, M. L., Su, S. Y., Kopf, J., & Huang, J. B. (2020). 3d photography using context-aware layered depth inpainting. In CVPR.

  • Simo-Serra, E., Iizuka, S., & Ishikawa, H. (2018). Mastering sketching: Adversarial augmentation for structured prediction. ACM Transactions on Graphics (TOG), 37, 1–3.

  • Song, J., Pang, K., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018a). Learning to sketch with shortcut cycle consistency. In CVPR.

  • Song, Y., Yang, C., Lin, Z., Liu, X., Huang, Q., Li, H., & Jay Kuo, C. C. (2018b). Contextual-based image inpainting: Infer, match, and translate. In ECCV.

  • Song, Y., Yang, C., Shen, Y., Wang, P., Huang, Q., & Kuo, C. C. J. (2018c). Spg-net: Segmentation prediction and guidance network for image inpainting. In BMVC.

  • Su, G., Qi, Y., Pang, K., Yang, J., & Song, Y. Z. (2020). SketchHealer: A graph-to-sequence network for recreating partial human sketches. In BMVC.

  • Tallon, C., Bertrand, O., Bouchet, P., & Pernier, J. (1995). Gamma-range activity evoked by coherent visual stimuli in humans. European Journal of Neuroscience, 7, 1285–1291.

  • Tariq, T., Tursun, O. T., Kim, M., & Didyk, P. (2020). Why are deep representations good perceptual quality features? In ECCV.

  • Theis, L., Oord, A., & Bethge, M. (2016). A note on the evaluation of generative models. In ICLR.

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In NeurIPS.

  • Wang, A., Ren, M., & Zemel, R. (2021). Sketchembednet: Learning novel concepts by imitating drawings. In ICML.

  • Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., & Jiang, Y. G. (2018a). Pixel2mesh: Generating 3d mesh models from single RGB images. In ECCV.

  • Wang, T. C., Liu, M. Y., Zhu, J. Y., Tao, A., Kautz, J., & Catanzaro, B. (2018b). High-resolution image synthesis and semantic manipulation with conditional gans. In CVPR.

  • Wilczkowiak, M., Brostow, G. J., Tordoff, B., & Cipolla, R. (2005). Hole filling through photomontage. In BMVC.

  • Xiong, W., Yu, J., Lin, Z., Yang, J., Lu, X., Barnes, C., & Luo, J. (2019). Foreground-aware image inpainting. In CVPR.

  • Xu, P., Joshi, C. K., & Bresson, X. (2019). Multi-graph transformer for free-hand sketch recognition. TNNLS.

  • Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., & Li, H. (2017). High-resolution image inpainting using multi-scale neural patch synthesis. In CVPR.

  • Yang, J., Lu, J., Lee, S., Batra, D., & Parikh, D. (2018). Graph R-CNN for scene graph generation. In ECCV.

  • Yang, L., Pang, K., Zhang, H., & Song, Y. Z. (2021a). Sketchaa: Abstract representation for abstract sketches. In ICCV.

  • Yang, L., Zhuang, J., Fu, H., Wei, X., Zhou, K., & Zheng, Y. (2021b). Sketchgnn: Semantic sketch segmentation with graph neural networks. ACM Transactions on Graphics (TOG), 40, 1–13.

  • Yao, T., Pan, Y., Li, Y., & Mei, T. (2018). Exploring visual relationship for image captioning. In ECCV.

  • Yi, Z., Zhang, H., Tan, P., & Gong, M. (2017). Dualgan: Unsupervised dual learning for image-to-image translation. In ICCV.

  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2018). Generative image inpainting with contextual attention. In CVPR.

  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2019). Free-form image inpainting with gated convolution. In ICCV.

  • Yu, Q., Yang, Y., Liu, F., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2017). Sketch-a-net: A deep neural network that beats humans. IJCV.

  • Zang, S., Tu, S., & Xu, L. (2021). Controllable stroke-based sketch synthesis from a self-organized latent space. Neural Networks, 137, 138–150.

  • Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., & Metaxas, D. N. (2017a). Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks. In ICCV.

  • Zhang, R., Zhu, J. Y., Isola, P., Geng, X., Lin, A. S., Yu, T., & Efros, A. A. (2017b). Real-time user-guided image colorization with learned deep priors. In ACM Transactions on Graphics (Proc. SIGGRAPH).

  • Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The unreasonable effectiveness of deep features as a perceptual metric. In CVPR.

  • Zheng, C., Cham, T. J., & Cai, J. (2019). Pluralistic image completion. In CVPR.

  • Zhou, H., Liu, Y., Liu, Z., Luo, P., & Wang, X. (2019). Talking face generation by adversarially disentangled audio-visual representation. In AAAI.

  • Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017a). Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV.

  • Zhu, J. Y., Zhang, R., Pathak, D., Darrell, T., Efros, A. A., Wang, O., & Shechtman, E. (2017b). Toward multimodal image-to-image translation. In NIPS.

Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 61601042).

Author information

Correspondence to Yonggang Qi.

Additional information

Communicated by William Smith.

About this article

Cite this article

Qi, Y., Su, G., Wang, Q. et al. Generative Sketch Healing. Int J Comput Vis 130, 2006–2021 (2022). https://doi.org/10.1007/s11263-022-01623-7
