Abstract
Protein–ligand docking is a useful tool for gaining an atomic-level understanding of protein functions in nature and for deriving design principles for artificial ligands or proteins with desired properties. Successful protein–ligand docking requires the ability to identify the true binding pose of a ligand to a target protein among numerous candidate poses. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were then also used for scoring binding poses. In this study, by contrast, we developed a new docking scoring function, called GalaxyDock BP2 Score, by training it directly for its power to score binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The new scoring function shows a significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein–ligand docking program, it outperforms other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.
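As a rough illustration of the idea of a hybrid score and of a decoy pose discrimination test, the following minimal sketch combines three component terms into a weighted sum and checks whether the best-scored pose is close to the experimental one. Everything in it is an assumption made for illustration: the term names, the weights, the toy pose values, and the 2.0 Å RMSD success cutoff are hypothetical and are not taken from GalaxyDock BP2 Score.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    name: str
    terms: dict   # hypothetical component scores (lower = more favorable)
    rmsd: float   # RMSD to the experimental pose, in angstroms


def hybrid_score(terms, weights):
    """Weighted sum over the component terms named in `weights`."""
    return sum(w * terms[key] for key, w in weights.items())


def top_pose(poses, weights):
    """Return the pose with the lowest (most favorable) hybrid score."""
    return min(poses, key=lambda p: hybrid_score(p.terms, weights))


def is_success(poses, weights, cutoff=2.0):
    """Assumed success criterion: top-scored pose within `cutoff` angstroms."""
    return top_pose(poses, weights).rmsd <= cutoff


# Hypothetical weights and toy poses, for illustration only.
WEIGHTS = {"physics": 1.0, "empirical": 0.5, "knowledge": 0.25}
POSES = [
    Pose("near_native", {"physics": -10.0, "empirical": -4.0, "knowledge": -8.0}, 0.8),
    Pose("decoy_a", {"physics": -12.0, "empirical": 2.0, "knowledge": -1.0}, 6.5),
    Pose("decoy_b", {"physics": -6.0, "empirical": -5.0, "knowledge": -3.0}, 4.1),
]

if __name__ == "__main__":
    print(top_pose(POSES, WEIGHTS).name)            # hybrid score
    print(top_pose(POSES, {"physics": 1.0}).name)   # single term only
```

In this toy data the physics-like term alone ranks a decoy first, while the weighted combination ranks the near-native pose first. That is the kind of complementarity a balanced hybrid score aims for, although the actual terms and weights of the published function may differ substantially from this sketch.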
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Ministry of Science, ICT & Future Planning (Nos. 2016R1A2A1A05005485 and 2016M3C4A7952630) and by the Korea Institute of Science and Technology Information supercomputing center (KSC-2015-C2-057). We thank Andrew Beaven for his careful reading of the manuscript.
Cite this article
Baek, M., Shin, WH., Chung, H.W. et al. GalaxyDock BP2 score: a hybrid scoring function for accurate protein–ligand docking. J Comput Aided Mol Des 31, 653–666 (2017). https://doi.org/10.1007/s10822-017-0030-9
DOI: https://doi.org/10.1007/s10822-017-0030-9