ABSTRACT
Click-through rate prediction is an important task in commercial recommender systems and it aims to predict the probability of a user clicking on an item. The event of a user clicking on an item is accompanied by several user and item features. As modelling the feature interactions effectively can lead to better predictions, it has been the focus of many recent approaches including deep learning-based models. However, the existing approaches either (i) model all possible feature interactions for a given order, or (ii) manually select which feature interactions to model. Besides, they use the same network structure or function to model all the feature interactions while ignoring the difference of complexity among them. To address these issues, we propose a neural architecture search based approach called AutoFeature that automatically finds essential feature interactions and selects an appropriate structure to model each of these interactions. Specifically, we first define a flexible architecture search space for the CTR prediction task which covers many popular designs such as PIN, PNN and DeepFM, etc., and enables higher-order interactions. Then we propose an efficient neural architecture search algorithm that recursively refines the search space by partitioning it into several subspaces and samples from higher quality ones. Extensive experiments on multiple CTR prediction benchmarks show the superiority of our AutoFeature over the state-of-the-art baselines. In addition, our experiments show that the learned architectures use fewer flops/parameters and hence can efficiently incorporate higher-order feature interactions. This further boosts the performance. Finally, we show that AutoFeature can find meaningful feature interactions.
- David J Aldous. 1985. Exchangeability and related topics. In École d'Été de Probabilités de Saint-Flour XIII?1983. Springer, 1--198.Google Scholar
- Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normalization. arXiv preprint arXiv:1607.06450 (2016).Google Scholar
- Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar. 2016. Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167 (2016).Google Scholar
- Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, and Jun Wang. 2018. Efficient architecture search by network transformation. In AAAI.Google Scholar
- Han Cai, Ligeng Zhu, and Song Han. 2019. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, In ICLR. arXiv preprint arXiv:1812.00332.Google Scholar
- Liang-Chieh Chen, Maxwell Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, and Jon Shlens. 2018. Searching for efficient multi-scale architectures for dense image prediction. In NIPS.Google Scholar
- Tianqi Chen and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. ACM, 785--794.Google ScholarDigital Library
- Yukang Chen, Tong Yang, Xiangyu Zhang, Gaofeng Meng, Chunhong Pan, and Jian Sun. 2019. Detnas: Neural architecture search on object detection. arXiv preprint arXiv:1903.10979 (2019).Google Scholar
- Hengtze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Deepak Chandra, Hrishi Aradhye, Glen Anderson, Greg S Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & Deep Learning for Recommender Systems. conference on recommender systems (2016), 7--10.Google Scholar
- Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics. 249--256.Google Scholar
- Alois Gruson, Praveen Chandar, Christophe Charbuillet, James McInerney, Samantha Hansen, Damien Tardieu, and Ben Carterette. 2019. Offline Evaluation to Make Decisions About PlaylistRecommendation Algorithms. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 420--428.Google ScholarDigital Library
- Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19--25, 2017. 1725--1731. https://doi.org/10.24963/ijcai.2017/239Google ScholarCross Ref
- Yuchin Juan, Yong Zhuang, Wei Sheng Chin, and Chih Jen Lin. 2016. Field-aware Factorization Machines for CTR Prediction. In ACM Conference on Recommender Systems. 43--50.Google Scholar
- Kuang-chih Lee, Burkay Orten, Ali Dasdan, and Wentong Li. 2012. Estimating conversion rate in display advertising from past erformance data. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 768--776.Google Scholar
- Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, and Guangzhong Sun. 2018. xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. arXiv preprint arXiv:1803.05170 (2018).Google Scholar
- Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, and Li Fei-Fei. 2019. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation. arXiv preprint arXiv:1901.02985 (2019).Google Scholar
- Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, and Koray Kavukcuoglu. 2017. Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436 (2017).Google Scholar
- Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. Darts: Differentiable architecture search. In ICLR.Google Scholar
- Qiang Liu, Feng Yu, Shu Wu, and Liang Wang. 2015. A convolutional click prediction model. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. ACM, 1743--1746.Google ScholarDigital Library
- H. Brendan McMahan, Gary Holt, David Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, Sharat Chikkerur, Dan Liu, Martin Wattenberg, Arnar Mar Hrafnkelsson, Tom Boulos, and Jeremy Kubica. 2013. Ad click prediction: a view from the trenches. In The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11--14, 2013. 1222--1230. https://doi.org/10.1145/2487575.2488200Google ScholarDigital Library
- Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based neural networks for user response prediction. In 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 1149--1154.Google ScholarCross Ref
- Yanru Qu, Bohui Fang, Weinan Zhang, Ruiming Tang, Minzhe Niu, Huifeng Guo, Yong Yu, and Xiuqiang He. 2018. Product-based neural networks for user response prediction over multi-field categorical data. ACM Transactions on Information Systems (TOIS), Vol. 37, 1 (2018), 5.Google ScholarDigital Library
- Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. 2018. Regularized evolution for image classifier architecture search. arXiv preprint arXiv:1802.01548 (2018).Google Scholar
- Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V Le, and Alexey Kurakin. 2017. Large-scale evolution of image classifiers. In ICML.Google Scholar
- Steffen Rendle. 2010. Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on. IEEE, 995--1000.Google ScholarDigital Library
- Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, and Tong Zhang. 2020. Multi-objective Neural Architecture Search via Predictive Network Performance Optimization. https://openreview.net/forum?id=rJgffkSFPSGoogle Scholar
- Linnan Wang, Yiyang Zhao, Yuu Jinnai, and Rodrigo Fonseca. 2018. Alphax: exploring neural architectures with deep neural networks and monte carlo tree search. arXiv preprint arXiv:1805.07440 (2018).Google Scholar
- Ruoxi Wang, Bin Fu, Gang Fu, and Mingliang Wang. 2017. Deep & cross network for ad click predictions. In Proceedings of the ADKDD'17. ACM, 12.Google ScholarDigital Library
- Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, and Tat-Seng Chua. 2017. Attentional factorization machines: Learning the weight of feature interactions via attention networks. arXiv preprint arXiv:1708.04617 (2017).Google Scholar
- Sirui Xie, Hehui Zheng, Chunxiao Liu, and Liang Lin. 2019. SNAS: stochastic neural architecture search. In ICLR.Google Scholar
- Weinan Zhang, Tianming Du, and Jun Wang. 2016. Deep learning over multi-field categorical data. In European conference on information retrieval. Springer, 45--57.Google ScholarCross Ref
- Zhao Zhong, Junjie Yan, Wei Wu, Jing Shao, and Cheng-Lin Liu. 2018. Practical block-wise neural network architecture generation. In Proceedings of the CVPR.Google ScholarCross Ref
- Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep interest network for click-through rate prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1059--1068.Google ScholarDigital Library
- Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. 2018. Learning transferable architectures for scalable image recognition. In CVPR.Google Scholar
Index Terms
- AutoFeature: Searching for Feature Interactions and Their Architectures for Click-through Rate Prediction
Recommendations
Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data MiningClick-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems, driving personalized experience for billions of consumers. Neural architecture search (NAS), as an emerging field, has demonstrated its ...
Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
WWW '19: The World Wide Web ConferenceClick-Through Rate prediction is an important task in recommender systems, which aims to estimate the probability of a user to click on a given item. Recently, many deep models have been proposed to learn low-order and high-order feature interactions ...
Satisfaction-Aware User Interest Network for Click-Through Rate Prediction
CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge ManagementClick-Through Rate (CTR) prediction plays a pivotal role in numerous industrial applications, including online advertising and recommender systems. Existing approaches primarily focus on modeling the correlation between user interests and candidate ...
Comments