What Image do You Need? A Two-stage Framework for Image Selection in E-commerce

Authors:
Sheng You (Alibaba Group, China; ORCID 0000-0001-5289-4171),
Chao Wang (Shanghai University, China; ORCID 0000-0003-4843-1953),
Baohua Wu (Alibaba Group, China; ORCID 0000-0002-3627-7058),
Jingping Liu (East China University of Science and Technology, China; ORCID 0000-0002-8671-2302),
Quan Lu (Alibaba Group, China; ORCID 0000-0002-4115-3068),
Guanzhou Han (Alibaba Group, China; ORCID 0000-0002-3399-9645),
Yanghua Xiao (Fudan University, China; ORCID 0000-0001-8403-9591)

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023, April 2023, Pages 452–456. https://doi.org/10.1145/3543873.3584646

Published: 30 April 2023

ABSTRACT

In e-commerce, images are widely used to convey intuitive information about items, and the choice of image significantly affects the user's click-through rate (CTR). Most existing work treats CTR as the target when choosing an appropriate image. However, such methods are difficult to deploy online efficiently. Moreover, the selected images may boost CTR without being related to the item, which undesirably entices users to click on the item. To address these issues, we propose a novel two-stage pipeline consisting of a content-based recall model and a CTR-based ranking model. The recall stage is realized as a joint method that combines a title-image matching model with a multi-modal knowledge graph embedding learning model. The ranking stage is a CTR-based, visually aware scoring model that incorporates entity textual information and entity images. Offline evaluations demonstrate the effectiveness and efficiency of our method. In a month of online A/B testing on the travel platform Fliggy, our method achieves a relative improvement of 5% in CTCVR over seller-selected images in the search scenario, and it further improves pCTR from 3.48% for human-picked images to 3.53% in the recommendation scenario.
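The recall-then-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` fields, the threshold, and the `top_k` cutoff are hypothetical, and the two scores stand in for the outputs of the content-based recall model and the CTR-based ranking model, which are not specified here. The `relative_lift` helper only shows the arithmetic for turning the reported pCTR values (3.48% to 3.53%) into a relative change.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """A candidate image for one item. All fields are illustrative placeholders."""
    image_id: str
    relevance: float      # stand-in for a content-based (title-image) relevance score
    predicted_ctr: float  # stand-in for a CTR-based ranking score


def recall(candidates: List[Candidate], min_relevance: float, top_k: int) -> List[Candidate]:
    """Stage 1: keep at most top_k images whose relevance passes the threshold."""
    relevant = [c for c in candidates if c.relevance >= min_relevance]
    relevant.sort(key=lambda c: c.relevance, reverse=True)
    return relevant[:top_k]


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Stage 2: order the recalled images by predicted CTR, highest first."""
    return sorted(candidates, key=lambda c: c.predicted_ctr, reverse=True)


def select_image(candidates: List[Candidate],
                 min_relevance: float = 0.5,
                 top_k: int = 3) -> Optional[Candidate]:
    """Run recall, then ranking; return the best image, or None if nothing is recalled."""
    ranked = rank(recall(candidates, min_relevance, top_k))
    return ranked[0] if ranked else None


def relative_lift(baseline: float, treatment: float) -> float:
    """Relative change of treatment over baseline, e.g. 0.05 means +5%."""
    return (treatment - baseline) / baseline


# Hypothetical scores: the first image has the highest CTR but is unrelated to the item.
example = [
    Candidate("clickbait", relevance=0.1, predicted_ctr=0.090),
    Candidate("hotel_room", relevance=0.9, predicted_ctr=0.035),
    Candidate("hotel_lobby", relevance=0.8, predicted_ctr=0.040),
    Candidate("map", relevance=0.6, predicted_ctr=0.020),
]
best = select_image(example)
print(best.image_id if best else None)
print(f"{relative_lift(3.48, 3.53):.2%}")
```

Because ranking only sees images that survived recall, an unrelated image cannot win on CTR alone, which is the failure mode the abstract attributes to purely CTR-driven selection.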

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Learning to rank

Published in
WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
April 2023
1567 pages
ISBN:9781450394192
DOI:10.1145/3543873
Editors:
Ying Ding,
Jie Tang,
Juan Sequeda,
Lora Aroyo,
Carlos Castillo,
Geert-Jan Houben
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Click-Through Rate
Image Selection
Recall and Rank
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%