Predicting partition coefficients for the SAMPL7 physical property challenge using the ClassicalGSG method

Journal of Computer-Aided Molecular Design

Abstract

Prediction of \(\log P\) values is one part of the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) blind challenges. Here, we use a molecular graph representation method called Geometric Scattering for Graphs (GSG) to transform atomic attributes into molecular features. The atomic attributes are parameters from classical molecular force fields, including partial charges and Lennard–Jones interaction parameters. The molecular features from GSG are used as inputs to neural networks trained on a “master” dataset comprising over 41,000 unique \(\log P\) values. The molecular targets in the SAMPL7 \(\log P\) prediction challenge were distinctive in that they all contained a sulfonyl moiety. This motivated a set of ClassicalGSG submissions in which predictors were trained on different subsets of the master dataset, filtered according to chemical types and/or the presence of the sulfonyl moiety. Our ranked prediction placed 5th, with an RMSE of 0.77 \(\log P\) units and an MAE of 0.62, while one of our non-ranked predictions placed first among all submissions, with an RMSE of 0.55 and an MAE of 0.44. After the conclusion of the challenge, we also examined the performance of open-source force field parameters that allow for an end-to-end \(\log P\) predictor model: General AMBER Force Field (GAFF), Universal Force Field (UFF), Merck Molecular Force Field 94 (MMFF94) and Ghemical. We find that ClassicalGSG models trained with atomic attributes from MMFF94 can yield more accurate predictions than those trained with CGenFF atomic attributes.
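
To illustrate how a graph scattering transform can turn per-atom attributes into a fixed-length molecular feature vector, the following is a minimal sketch and not the authors' ClassicalGSG implementation. It assumes the lazy random-walk diffusion wavelets commonly used in geometric scattering, with unnormalized moments summed over atoms. The function names, the toy 4-atom chain, and the attribute values (a partial charge and a Lennard–Jones well depth per atom) are hypothetical.

```python
import numpy as np


def lazy_walk(adjacency):
    """Lazy random-walk operator P = (I + A D^-1) / 2 for an undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    degree = a.sum(axis=0)
    degree[degree == 0] = 1.0  # isolated atoms: avoid division by zero
    return 0.5 * (np.eye(len(a)) + a / degree)


def wavelets(p, num_scales):
    """Diffusion wavelets Psi_0 = I - P and Psi_j = P^(2^(j-1)) - P^(2^j)."""
    powers = [np.eye(len(p)), p]  # P^0 and P^1
    for _ in range(num_scales):
        powers.append(powers[-1] @ powers[-1])  # P^2, P^4, P^8, ...
    return [powers[j] - powers[j + 1] for j in range(num_scales + 1)]


def moments(signal, max_moment):
    """Sum over atoms of signal**q for q = 1..max_moment, per attribute column."""
    return np.concatenate(
        [(signal ** q).sum(axis=0) for q in range(1, max_moment + 1)]
    )


def scattering_features(adjacency, attributes, num_scales=2, max_moment=2):
    """Zeroth-, first- and second-order scattering moments of atomic attributes."""
    x = np.asarray(attributes, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # a single attribute per atom
    psi = wavelets(lazy_walk(adjacency), num_scales)
    feats = [moments(x, max_moment)]                  # zeroth order: raw attributes
    first = [np.abs(w @ x) for w in psi]              # |Psi_j x|
    feats += [moments(u, max_moment) for u in first]  # first order
    for j, u in enumerate(first):                     # second order, pairs j < j'
        feats += [
            moments(np.abs(psi[k] @ u), max_moment)
            for k in range(j + 1, len(psi))
        ]
    return np.concatenate(feats)


# Toy example: a 4-atom chain with two made-up attributes per atom.
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
attributes = [[-0.3, 0.10], [0.2, 0.05], [0.4, 0.20], [-0.3, 0.10]]
features = scattering_features(adjacency, attributes)
print(features.shape)  # (28,)
```

Because every feature is a sum over atoms, the vector does not depend on atom ordering and has the same length for molecules of any size, which is what allows it to serve as input to a standard feed-forward neural network.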

Corresponding author

Correspondence to Alex Dickson.

Supplementary Information

Electronic supplementary material 1 (PDF 128 kb)

Cite this article

Donyapour, N., Dickson, A. Predicting partition coefficients for the SAMPL7 physical property challenge using the ClassicalGSG method. J Comput Aided Mol Des 35, 819–830 (2021). https://doi.org/10.1007/s10822-021-00400-x
