ABSTRACT
Digital art synthesis is receiving increasing attention in the multimedia community because it engages the public with art effectively. Current digital art synthesis methods usually rely on single-modality inputs as guidance, which limits the expressiveness of the model and the diversity of the generated results. To address this problem, we propose the multimodal guided artwork diffusion (MGAD) model, a diffusion-based digital artwork generation approach that uses multimodal prompts to guide a classifier-free diffusion model. Additionally, the contrastive language-image pretraining (CLIP) model is used to unify the text and image modalities. Extensive qualitative and quantitative results on generated digital art paintings confirm the effectiveness of combining the diffusion model with multimodal guidance. Code is available at https://github.com/haha-lisa/MGAD-multimodal-guided-artwork-diffusion.
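The core idea of multimodal guidance — steering generation toward both a text prompt and an image prompt in a shared CLIP embedding space — can be sketched as a weighted similarity loss. The function names, the linear weighting scheme, and the toy vectors below are illustrative assumptions, not the paper's actual implementation; in MGAD the embeddings would come from CLIP's text and image encoders and the loss would steer the diffusion sampling process.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def multimodal_guidance_loss(image_emb, text_emb, style_emb,
                             w_text=0.5, w_style=0.5):
    """Hypothetical combined guidance loss in a shared embedding space.

    Lower is better: the candidate image embedding should be close to
    both the text-prompt embedding and the style-image embedding.
    """
    text_term = 1.0 - cosine_sim(image_emb, text_emb)
    style_term = 1.0 - cosine_sim(image_emb, style_emb)
    return w_text * text_term + w_style * style_term

# Toy 2-D embeddings standing in for CLIP outputs.
text_emb = np.array([1.0, 0.0])
style_emb = np.array([0.0, 1.0])
candidate = np.array([1.0, 1.0])   # close to both prompts
print(multimodal_guidance_loss(candidate, text_emb, style_emb))
```

A guided sampler would differentiate a loss of this shape with respect to the intermediate image at each denoising step and nudge the sample in the direction that reduces it, balancing text fidelity against style fidelity via the two weights.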
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion