DOI: 10.1145/3534678.3539278
Research Article | Public Access

Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction

Published: 14 August 2022

ABSTRACT

A representation (feature) space is an environment in which data points are vectorized, distances are computed, patterns are characterized, and geometric structures are embedded. Extracting a good representation space is critical for addressing the curse of dimensionality, improving model generalization, overcoming data sparsity, and broadening the applicability of classic models. Existing work, such as feature engineering and representation learning, falls short of full automation (e.g., heavy reliance on manual labor and empirical experience), explainable explicitness (e.g., a traceable reconstruction process and explainable new features), and flexible optimality (e.g., optimal feature space reconstruction is not embedded into downstream tasks). Can we simultaneously address the automation, explicitness, and optimality challenges in representation space reconstruction for a machine learning task? To answer this question, we propose a group-wise reinforcement generation perspective. We reformulate representation space reconstruction as an interactive process of nested feature generation and selection, where feature generation creates new, meaningful, and explicit features, and feature selection eliminates redundant features to control the size of the feature set. We develop a cascading reinforcement learning method that leverages three cascading Markov Decision Processes to learn optimal generation policies that automate the selection of features and operations and the crossing of features. We design a group-wise generation strategy that crosses a feature group, an operation, and another feature group to generate new features, and we find that this strategy enhances exploration efficiency and augments the reward signals of the cascading agents. Finally, we present extensive experiments to demonstrate the effectiveness, efficiency, traceability, and explicitness of our system.
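
To make the group-wise generation idea more concrete, the following is a minimal Python sketch, not the authors' implementation, of one generate-then-select round: two hypothetical feature groups are crossed with a binary operation to produce candidate features, and a simple correlation filter stands in for the feature-selection step that controls feature size. The group assignments, the operation, and the filter are illustrative assumptions; in the paper these choices are made by the cascading reinforcement-learning agents rather than hard-coded.

```python
# Illustrative sketch of group-wise feature crossing followed by a simple
# selection step. NOT the paper's implementation: the feature groups, the
# operation, and the correlation-based filter are assumptions chosen only
# to show the overall shape of nested generation and selection.
import numpy as np
import pandas as pd


def cross_groups(df: pd.DataFrame, group_a, group_b, op=np.multiply) -> pd.DataFrame:
    """Cross every feature in group_a with every feature in group_b using a
    binary operation, producing |A| x |B| explicit candidate features."""
    candidates = {}
    for a in group_a:
        for b in group_b:
            candidates[f"{op.__name__}({a},{b})"] = op(df[a].to_numpy(), df[b].to_numpy())
    return pd.DataFrame(candidates, index=df.index)


def select_top_k(candidates: pd.DataFrame, target: pd.Series, k: int = 5) -> pd.DataFrame:
    """Keep the k candidates most correlated (in absolute value) with the
    target -- a stand-in for the feature-selection step that controls size."""
    scores = candidates.apply(lambda col: abs(np.corrcoef(col, target)[0, 1]))
    return candidates[scores.nlargest(k).index]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f1", "f2", "f3", "f4"])
    y = pd.Series(df["f1"] * df["f3"] + 0.1 * rng.normal(size=200))

    # Hypothetical group assignment; in the paper the head group, operation,
    # and tail group are each selected by a dedicated agent.
    new_feats = cross_groups(df, group_a=["f1", "f2"], group_b=["f3", "f4"])
    kept = select_top_k(new_feats, y, k=2)
    print(kept.columns.tolist())  # multiply(f1,f3) should rank highly here
```

In the full method, a round like the one above corresponds to a single step of the three cascading Markov Decision Processes, with the reward signal tied to downstream task performance.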


Supplemental Material

KDD22-rtfp0610.mp4 (mp4, 32 MB)


Published in

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022, 5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678

          Copyright © 2022 ACM


Publisher

Association for Computing Machinery, New York, NY, United States



Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
