ABSTRACT
Molecular representation learning has recently attracted considerable attention. A molecule can be viewed as a 2D graph whose nodes (atoms) are connected by edges (bonds), or as a 3D conformation specifying the three-dimensional coordinates of all atoms. Most previous work handles 2D and 3D information separately, whereas jointly leveraging the two sources can yield a more informative representation. In this work, we explore this idea and propose a new representation learning method based on unified 2D and 3D pre-training. Atom coordinates and interatomic distances are encoded and then fused with atomic representations through graph neural networks. The model is pre-trained on three tasks: reconstruction of masked atoms and coordinates, 3D conformation generation conditioned on the 2D graph, and 2D graph generation conditioned on the 3D conformation. We evaluate our method on 11 downstream molecular property prediction tasks: 7 with 2D information only and 4 with both 2D and 3D information. Our method achieves state-of-the-art results on 10 of the 11 tasks, with an average improvement of 8.3% on the 2D-only tasks. It also yields significant improvements on two 3D conformation generation tasks.
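To make the fusion step concrete, the following is a minimal NumPy sketch of the general idea described above: interatomic distances are encoded (here with a Gaussian radial basis, a common choice in the literature) and combined with 2D bond connectivity to weight a message-passing update of atom features. The function names, the additive fusion rule, and all shapes are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def gaussian_distance_encoding(coords, centers, width=0.5):
    """Encode pairwise interatomic distances with Gaussian basis functions.
    coords: (n_atoms, 3) 3D coordinates; centers: (k,) RBF centers.
    Returns (n_atoms, n_atoms, k) distance features."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                 # (n, n) distances
    return np.exp(-((dist[..., None] - centers) ** 2) / (2 * width ** 2))

def fuse_message_passing(atom_feats, adjacency, dist_feats, w_pair):
    """One hypothetical fusion step: neighbor aggregation weighted by both
    bond connectivity (2D graph) and projected distance encodings (3D)."""
    pair_scalar = dist_feats @ w_pair                    # (n, n) 3D weights
    weights = adjacency * pair_scalar                    # fuse 2D and 3D
    return atom_feats + weights @ atom_feats             # residual update

# Toy 3-atom molecule with a chain topology A-B-C.
coords = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.1],
                   [0.0, 1.0, 1.6]])
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
atom_feats = np.eye(3)                                   # one-hot atom features
centers = np.linspace(0.0, 3.0, 8)
w_pair = np.random.default_rng(0).normal(size=8)         # untrained projection

dist_feats = gaussian_distance_encoding(coords, centers)
fused = fuse_message_passing(atom_feats, adjacency, dist_feats, w_pair)
print(fused.shape)
```

In a trained model, `w_pair` and the atom features would be learned parameters, and the update would be stacked across several GNN layers; the sketch only shows where the 2D and 3D signals meet.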