Rationality over fashion and hype in drug design

José L. Medina-Franco; Karina Martinez-Mayorga; Eli Fernández-de Gortari; Johannes Kirchmair; Jürgen Bajorath

doi:10.12688/f1000research.52676.1

Opinion Article

[version 1; peer review: 2 approved]

José L. Medina-Franco¹, Karina Martinez-Mayorga², Eli Fernández-de Gortari³, Johannes Kirchmair⁴, Jürgen Bajorath⁵

PUBLISHED 18 May 2021

¹ DIFACQUIM research group, Department of Pharmacy, School of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, 04510, Mexico
² Instituto de Química, Universidad Nacional Autónoma de México, Mexico City, 04510, Mexico
³ Nanosafety Laboratory, International Iberian Nanotechnology Laboratory, Braga, 4715-330, Portugal
⁴ Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Vienna, 1090, Austria
⁵ Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, D-53115, Germany

José L. Medina-Franco
Roles: Conceptualization, Writing – Original Draft Preparation, Writing – Review & Editing

Karina Martinez-Mayorga
Roles: Writing – Original Draft Preparation, Writing – Review & Editing

Eli Fernández-de Gortari
Roles: Writing – Original Draft Preparation, Writing – Review & Editing

Johannes Kirchmair
Roles: Writing – Original Draft Preparation, Writing – Review & Editing

Jürgen Bajorath
Roles: Writing – Original Draft Preparation, Writing – Review & Editing


This article is included in the Cheminformatics gateway.

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

The current hype associated with machine learning and artificial intelligence often confuses scientists and students and may lead to uncritical or inappropriate applications of computational approaches. The field of computer-aided drug design (CADD) is no exception. The situation is ambivalent. On the one hand, more scientists are becoming aware of the benefits of learning from available data and are beginning to derive predictive models before designing experiments. On the other hand, easy accessibility of in silico tools comes at the risk of using them as “black boxes” without sufficient expert knowledge, leading to widespread misconceptions and problems. For example, results of computations may be taken at face value as “nothing but the truth” and data visualization may be used only to generate “pretty and colorful pictures”. Computational experts might come to the rescue and help to re-direct such efforts, for example, by guiding interested novices to conduct meaningful data analysis, make scientifically sound predictions, and communicate the findings in a rigorous manner. However, this is not always ensured. This contribution aims to encourage investigators entering the CADD arena to obtain adequate computational training, communicate or collaborate with experts, and become aware of the fundamentals of computational methods and their given limitations, beyond the hype. By its very nature, this Opinion is partly subjective and we do not attempt to provide a comprehensive guide to the best practices of CADD; instead, we wish to stimulate an open discussion within the scientific community and advocate rational rather than fashion-driven use of computational methods. We take advantage of the open peer-review culture of F1000Research such that reviewers and interested readers may engage in this discussion and obtain credits for their candid personal views and comments. We hope that this open discussion forum will contribute to shaping the future practice of CADD.

Keywords

artificial intelligence, computer-aided drug design, drug discovery, chemoinformatics, education, open science

Corresponding author: José L. Medina-Franco

Competing interests: No competing interests were disclosed.

Grant information: KM-M thanks DGAPA-PASPA for financial support.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2021 Medina-Franco JL et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Medina-Franco JL, Martinez-Mayorga K, Fernández-de Gortari E et al. Rationality over fashion and hype in drug design [version 1; peer review: 2 approved]. F1000Research 2021, 10(Chem Inf Sci):397 (https://doi.org/10.12688/f1000research.52676.1) First published: 18 May 2021, 10(Chem Inf Sci):397 (https://doi.org/10.12688/f1000research.52676.1) Latest published: 18 May 2021, 10(Chem Inf Sci):397 (https://doi.org/10.12688/f1000research.52676.1)

Computer-aided drug discovery

Computer-aided drug discovery (CADD) has become a key technology in drug discovery, providing guidance to experimentalists on which compounds and experiments to focus on next. The capacity of CADD has been further increased by the development of powerful machine learning approaches, deep learning in particular.¹ In recent years, software for CADD has become more widely accessible. Today, a range of software packages are available that are open-source or free to use for academic research.²⁻⁵ Together with significant alterations of the scientific landscape induced by the COVID-19 pandemic and the ensuing re-orientation of some early-career and also established research groups towards computational tools, this has boosted the use of CADD methods, particularly in academic research environments. However, low-barrier access to computational tools and computing power increasingly leads to the use of CADD techniques by scientists who have not received formal training in these methods. Many newcomers to CADD employ easy-to-use software without realizing the complexities involved and without being aware of the many potential pitfalls. Improper use of CADD techniques can have the opposite of the intended effect on research. The risk of generating meaningless or false predictions is high. Flawed predictions can lead to dedicating significant resources to futile experiments. In particular, publishing invalid predictions can lead to error propagation, eventually resulting in a loss of confidence in CADD. Moreover, uncritical or naive use of artificial intelligence (AI) methods, which are being heavily promoted in many research fields, has similar negative effects, working against the credibility and acceptance of CADD as a scientific discipline.

Good practices for conducting and reporting studies in computational medicinal chemistry and chemoinformatics have been outlined in several articles.³,⁶,⁷ Similarly, best practices in different stages involved in CADD have been discussed, for instance, in quantitative structure-activity relationship (QSAR) analysis,⁸ data curation,⁹ molecular docking¹⁰,¹¹ and virtual screening.¹²

In this contribution, we discuss common misconceptions and false expectations associated with CADD, especially in the AI era, and make recommendations on how to avoid common pitfalls when using CADD software. We aim to stimulate an open discussion within the community to help improve our perception and practice of CADD and contribute to shaping its future.

CADD and related fields

Experts from various disciplines involved in drug discovery, such as chemical synthesis and biochemistry, are increasingly making use of computational tools to guide their experimental research and rationalize their observations. This is a positive trend, as developers of CADD tools have been aiming for a long time to make their software more widely accessible and intuitive to use. Prominent examples include web servers, and commercial, free and open-source software. For the field to develop further, however, it is of paramount importance to avoid confusion about the concepts of, and differences between, the areas contributing to drug discovery, such as molecular modeling, chemoinformatics, and theoretical chemistry. Conceptual and practical differences between these disciplines are clearly described in the literature.¹³,¹⁴ Importantly, theoretical disciplines are an integral part of CADD, establishing its scientific foundations. The ability to apply such approaches using software by no means guarantees that sound research is carried out. Therefore, any conclusions or claims should be carefully considered.

Common misconceptions and false perceptions when CADD is superficially viewed

The advent of more advanced computational tools, many of which are open source, freely accessible, and promoted as “easy-to-use,” also increases the use of “buzz words” and the spread of misconceptions among newcomers to CADD.

Table 1 gives examples of incorrect expressions and misconceptions frequently affecting students and researchers with little or no expertise in CADD. Readers and peer reviewers are welcome to comment openly on these points and modify or enrich the list according to their own experience.

Table 1. Examples of misconceptions versus the intended use accepted by experts in the field.

Misconceptions or misleading statements	Correct meaning
Computational methods are fast and cheap, and in particular in the pandemic situation we can, and should, put experimental research on hold and turn to in silico methods.	In silico studies may be conducted independently of experiments. In fact, theoretical approaches may address research questions that are beyond experimental accessibility. Ideally, computation and measurement are integrated, e.g. for the purpose of model validation and in applied research.
Computational studies are fast and easy to conduct, and they always produce results.	Depending on the research question, computational projects can in fact be resource-demanding and time-consuming. The fact that they “always” produce results does not mean that these results are “always” valid (this is certainly not the case). Using any results without careful vetting carries a significant risk that predictions are false or inaccurate.
Purely computational studies have limited value and do not represent standalone projects.	A purely computational study can be self-contained and comprehensive and may address research questions going beyond experimental accessibility.
In multidisciplinary projects, experimental testing is difficult but computational studies are easy.	Both computational and experimental work might be routine or challenging. The development and validation of new algorithms may well exceed the magnitude of experimental work.
Computational analysis mostly contributes catchy pictures to publications and grant applications.	If properly conducted, computational analysis can rationalize experimental observations and yield experimentally testable hypotheses.
Machine learning and AI are the new standard for CADD.	Machine learning is a part of AI and already has a long record of use in CADD. While being important to many types of predictions and enabling new applications, machine learning methods on their own have not revolutionized the field (yet).
Molecular modeling and chemoinformatics are other terms for CADD.	CADD covers various theoretical disciplines including molecular modeling, chemoinformatics, bioinformatics, theoretical chemistry, and machine learning.
Molecular docking can be used to demonstrate ligand binding.	Molecular docking approximates protein-ligand interactions and binding modes in a computationally complex manner that only partly resembles physical binding events. Entropic effects in particular are only poorly considered by docking approaches.
Rational drug design must incorporate a computational analysis.	A drug can be rationally designed based on prior knowledge, experience and even causal intuition, without the need to employ a computer.
Computational results are unbiased and thus most likely correct.	Any computational analysis is affected by methodological limitations in accounting for the physical reality. Hence, results must be interpreted with caution and awareness of such limitations. It is ultimately the responsibility of the researcher to arrive at a scientifically sound and trustworthy interpretation of the results.
Most computational techniques can be learned in a few hours of hands-on workshops.	How to run a piece of software might be learned rapidly, but understanding the theory behind a computational approach and gaining the experience essential to the correct interpretation of the relevance, meaning and reliability of predictions can be demanding and take a significant amount of time. Without a firm grasp of the underlying theoretical foundations, computational exercises may only lead to appealing but meaningless illustrations.
In contrast to reagents for organic synthesis, no “purification” is needed for the input of computational approaches.	Data curation is of fundamental importance to CADD. Without proper curation of the input data, no meaningful predictions will be possible. The role of data curation in CADD equals that of experimental preparation and reagent purification in chemical synthesis.
In contrast to reaction mechanisms in organic synthesis, understanding of how an algorithm works is not required in order to produce predictions.	Understanding algorithms is as important as understanding experimental approaches.
In contrast to a biochemical assay, controls are not required in a computational analysis.	Including controls into the calculations is an indispensable part of any scientifically sound computational study.
There are no computational “experiments”.	An inferior computational analysis will produce results meeting our pre-formed expectations. A properly designed computational study will unambiguously address a question or hypothesis for which we have no answer yet. This represents, in its best sense, an “in silico experiment”.
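
The table entry on controls can be illustrated with a small example. The following is a minimal sketch, not a prescribed protocol: it assumes a ranking method has been applied to a benchmark set in which a few known actives are mixed with presumed-inactive decoys, and it asks whether the known actives are ranked ahead of the decoys more often than random selection would achieve. The function name `enrichment_factor`, the compound names, and the ranking are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def enrichment_factor(ranked: List[Tuple[str, bool]], top_fraction: float) -> float:
    """Enrichment of known actives in the top fraction of a ranked list.

    ranked: (compound_id, is_active) pairs, best-scored compound first.
    Returns (hit rate in the top fraction) / (hit rate in the whole list);
    a value near 1.0 means the ranking is no better than random selection.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    total = len(ranked)
    total_actives = sum(1 for _, is_active in ranked if is_active)
    if total == 0 or total_actives == 0:
        raise ValueError("need a non-empty list with at least one known active")
    n_top = max(1, round(total * top_fraction))
    top_actives = sum(1 for _, is_active in ranked[:n_top] if is_active)
    return (top_actives / n_top) / (total_actives / total)


# Hypothetical control set: 2 known actives and 8 decoys, already sorted by score.
ranked_control = [
    ("active_1", True), ("decoy_1", False), ("active_2", True), ("decoy_2", False),
    ("decoy_3", False), ("decoy_4", False), ("decoy_5", False), ("decoy_6", False),
    ("decoy_7", False), ("decoy_8", False),
]
ef_top20 = enrichment_factor(ranked_control, 0.2)
```

If such a control shows little or no enrichment for compounds whose activity is already known, predictions made with the same setup for untested compounds deserve correspondingly little trust.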

When is a study “complete”?

Traditionally, there is a widespread belief in the experimental sciences that experimental results represent reality, disregarding the different ways in which natural phenomena can be represented and perceived and the relativity associated with varying representations. This ideological attitude works against “out of the box” thinking, hinders intellectual progress, and indirectly devalues scientific disciplines such as CADD. With the rise of AI as one of the most heavily promoted approaches in contemporary society, the academic community has been encouraged to redirect its attention to computational tools to enhance its research impact and appeal. Nevertheless, unconditional trust in “experimental reality” reduces CADD to a “tool provider” and does not regard it as an independent scientific discipline in its own right. Consequently, computational models are often used without the necessary theoretical understanding and the rigor needed to apply them systematically.

In the authors’ opinion, one of the first things a new computational practitioner needs to realize is that both experimental and computational results are constrained by the model or experimental framework applied to determine them and are in no case an absolute account of reality. Among medicinal chemists, there is the frequent misconception that purely theoretical or computational studies are in principle “incomplete” because there are no “real” experiments. However, such views require reconsideration and correction, as pointed out above. Rigorous computational studies answer questions that are difficult to address without “in silico experiments”. As such, they are comprehensive and self-contained, regardless of whether the computational approach has led to experiments. “Complete” computational investigations are often consistent with prior experimental observations, but may also chart new scientific territory. Of course, new computational insights leading to experimental work trigger interdisciplinary research. This is a noted strength of CADD, if conducted properly. However, there are misconceptions at interfaces between computation and experiment. For instance, a common malpractice is trying to replace enzymatic inhibition assays with predictions based on molecular docking or dynamics simulations. Another misunderstanding is that black box predictions from machine learning would represent a form of “alchemy”. What we cannot understand is not necessarily incorrect and may have value. The catch is that we are left with making decisions in such cases, for example, about new experiments that go beyond our reasoning and hence require trust in computational work and prior experience. It is also false to believe that AI in its current state would provide solutions to questions that replace our judgment capacity. Data volumes quickly go beyond our comprehension, but results of statistical analysis or pattern recognition do not replace human reasoning (algorithms and machines do not “think”, at least so far). Furthermore, there is a severe misconception that computational predictions might demonstrate or “validate” the bioactivity of compounds. Notably, these and other misunderstandings may not be evident to researchers and students who are just beginning to use computational methods. We encourage the community to avoid judging a computational research project to be “incomplete” because it does not include experiments or to be “complete” just because it incorporates many different computational methods. The question of completeness is not separable from scientific rigor and adequate conduct of methodologies, be they computational or experimental in nature. Furthermore, let us not consider a computational analysis as a “luxury item” to decorate a project report, grant application, or scientific paper with “pretty pictures”.

Using methods for the right reasons

Mainstream media usually disseminate inaccurate or exaggerated reports about the capabilities of computational methods without also mentioning their limitations and flaws. Simultaneously, mass job search engines commonly offer job opportunities with extensive lists of different computational tools as requirements. These factors, among others, continuously put pressure on researchers to increase their productivity and academic credentials to further their careers at the expense of scientific rigor and the quality of research. CADD is no exception to the growing trend that disrupts the traditional academic structure in favor of a more market-oriented approach. One of the consequences of this phenomenon is that many young professionals and new CADD practitioners direct their efforts to increasing the volume of their curriculum vitae rather than using CADD methods to answer relevant scientific questions.

The popularity of computational methods or tools often jeopardizes rational selection. Newcomers often turn to frequently used methods that are well-validated. However, the justification is questionable if the technique is merely used because it is “popular” (e.g., “follow the crowd” because “it should be right”). Without properly addressing the question at hand, computational analysis applying irrelevant approaches is misleading or propagates errors. Arguably, one of the most misused guidelines in drug discovery is the Lipinski Rule of Five,¹⁵ which is often confused with assessing “drug-likeness”. Another common pitfall among newcomers to CADD is using docking to predict “real” protein-ligand complexes, given its popularity and ease of use. Practitioners should use methods for the right reasons and not just because everybody else is using them. This requires knowledge of underlying theories and sound scientific judgment.
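
To make the point about the Rule of Five concrete, the following minimal sketch shows how little the rule actually computes. It assumes that the four descriptors have already been calculated elsewhere (how they are obtained is outside this sketch), and the class name, function name, and descriptor values are hypothetical. The thresholds are the commonly quoted ones: molecular weight above 500, calculated logP above 5, more than 5 hydrogen-bond donors, and more than 10 hydrogen-bond acceptors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed properties; how they are calculated is outside this sketch."""
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the Rule-of-Five criteria that a compound exceeds."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Hypothetical descriptor values, for illustration only.
small_polar = Descriptors(mol_weight=320.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_lipophilic = Descriptors(mol_weight=712.9, logp=6.3, h_bond_donors=3, h_bond_acceptors=9)

report = {
    "small_polar": rule_of_five_violations(small_polar),
    "large_lipophilic": rule_of_five_violations(large_lipophilic),
}
```

An empty list only means that four simple property thresholds are not exceeded. The check says nothing about potency, selectivity, or safety, which is one reason why it should not be read as a verdict on “drug-likeness”.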

General recommendations for the proper use of CADD resources

In the authors’ opinion, the following recommendations should be helpful to CADD novices and multidisciplinary research teams attempting or planning to use computational approaches to guide drug discovery projects. Similar to Table 1, the list is not exhaustive, but is also intended to stimulate an open discussion within the scientific community.

• Intense study of the literature is essential to acquire knowledge. Like experimental techniques, CADD methods require proper training to become familiar with their applicability domains, approximations and limitations.
• Computational research projects should primarily be problem-oriented rather than technique-oriented, unless the development of new techniques is itself the problem to be addressed. Projects (including dissertations) should be well-structured according to scientific criteria or milestones and should by no means be a mere compilation of techniques applied to the same data. Before deciding which computational approaches and tools to use, a comprehensive literature search should be conducted and exemplary applications should be reviewed. Then, based on the experimental information available, appropriate computational methods and strategies should be applied. Rushing into calculations with software packages, even with excitement, is typically detrimental if the applied methods are not scientifically justified. In addition to researching methodological aspects, it is also mandatory to carefully review the available experimental findings. For example, prior to applying virtual screening techniques to search for new compounds with activity against high-profile and extensively explored targets, care should be taken not to overlook prior art in the field and to avoid engaging in scientifically naive computational efforts. One should avoid setting the goals of a drug discovery project relative to a technique by pursuing a “tool-oriented approach”. Instead, planning computational components of an interdisciplinary research project should focus on the ultimate scientific goals. Students should realize and keep in mind that learning and applying different techniques across disciplines is desirable, but they should be used in harmony to answer research questions.
• Seek supervision or advice from experts and do not hesitate to ask. Consultation prior to engaging in a new scientific adventure will not only save time and resources but also help to plan a scientifically sound approach.
• Avoid excessive use of buzzwords such as “artificial intelligence” or “machine learning” when they are not applicable, as this contributes to the inappropriate hype associated with computational methods. For example, there is no need to use the AI or “machine intelligence” label for compound classification methods that have already been applied for decades.
• Keep in mind that many theoretical disciplines that contribute to CADD, such as machine learning, have a long history of their own.
• As in any wet lab experiment, input data quality is of critical importance for the outcome of computational studies. Awareness of data curation requirements is essential for the integrity of computational work.
• The uncritical or uneducated use of web-accessible computational tools or servers to generate new compounds, calculate molecular properties, or predict target structures and protein-ligand complexes is a major source of errors propagating through interdisciplinary projects.
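
The recommendation on input data quality can be sketched with a very small curation step. This is an assumption-laden illustration rather than a complete curation workflow: it treats each compound as an identifier string with an active/inactive label, removes entries with missing information, collapses exact duplicates, and sets aside compounds whose labels contradict each other. Real curation would additionally require structure standardization, which plain string matching does not provide. The function name `curate`, the identifiers, and the records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# (structure identifier, active?) where None marks a missing label
Record = Tuple[str, Optional[bool]]


def curate(records: Iterable[Record]) -> Tuple[Dict[str, bool], List[str]]:
    """Collapse duplicates and set aside compounds with contradictory labels.

    Returns (clean, conflicting): `clean` maps each identifier to its single
    agreed label; `conflicting` lists identifiers reported as both active and
    inactive, which need manual inspection rather than silent merging.
    """
    labels_seen = defaultdict(set)
    for identifier, label in records:
        key = identifier.strip()
        if not key or label is None:
            continue  # drop entries without an identifier or a recorded label
        labels_seen[key].add(label)

    clean = {k: next(iter(v)) for k, v in labels_seen.items() if len(v) == 1}
    conflicting = sorted(k for k, v in labels_seen.items() if len(v) > 1)
    return clean, conflicting


# Hypothetical raw records, as they might arrive from merged spreadsheets.
raw = [
    ("CPD-001", True),
    ("CPD-001 ", True),    # duplicate with trailing whitespace
    ("CPD-002", False),
    ("CPD-003", True),
    ("CPD-003", False),    # contradicts the previous entry
    ("CPD-004", None),     # no measurement recorded
    ("", True),            # missing identifier
]
clean, conflicting = curate(raw)
```

In this toy set, seven raw records reduce to two usable compounds and one that needs a closer look. Reporting such counts alongside a model makes the data preparation visible to readers and reviewers.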

Concluding remarks

The current pandemic and related funding constraints in some countries and institutions have motivated many researchers at different levels to redirect their efforts from difficult-to-sustain experimental studies to easy-to-use computational tools that can be employed remotely. Also, reviewer panels for many grant applications from academia, not-for-profit organizations, or industry currently tend to give priority to research proposals that involve AI. Although this contributes to the popularity of CADD, it also comes at a cost. If uneducated CADD studies enter the realm of science fiction, harm is done to this field, its credibility and acceptance, and its further scientific development. This must be avoided at all costs. The methods used in CADD should not be applied as black boxes, a mode of use that just a few hours of hands-on experience, such as that provided in workshops, can enable. Without sufficient understanding of the scope, complexity, and theoretical foundations of these computational methodologies, such efforts will inevitably fail and discredit investigators and their work as well as the field as a whole. Newcomers to the area, including students, early-career scientists, and seasoned investigators attempting to re-focus their efforts, should be fully aware that, similar to experiments, profound knowledge of CADD concepts and informed use of CADD tools are a must. A simple yet fundamentally important rule applies: “Don’t compute what you don’t understand”. In addition to the general recommendations outlined in this Opinion, we wish to encourage students, newcomers, and practitioners of CADD to use the computational tools and resources for the right reasons, not just because they are easily accessible. Similarly, we highly encourage the scientific community to avoid applying computational methods just because they are popular. Instead, it is strongly recommended to identify scientific questions that can be addressed appropriately using CADD approaches, and to avoid others where computational efforts become questionable. In general, computational studies that cannot be reported in established peer-reviewed journals whose scope includes CADD are to be considered with appropriate caution, both by experts and novices to the field. This also applies to the use of modeling web servers. While the integrity of publicly accessible computational tools can be guaranteed by the developers, addressing ill-defined questions or tasks using these tools is beyond their control. Recognizing the benefits of the open post-publication review culture of F1000Research, we would be delighted if this contribution would catalyze open discussions among readers to raise further awareness of latent problematic issues in the CADD area and support its further scientific development.

Data availability

No data is associated with this article.

Acknowledgments

We thank Zoe Sessions for critically proofreading the manuscript.

References

1. Gasteiger J: Chemistry in Times of Artificial Intelligence. ChemPhysChem. 2020; 21(20): 2233–2242.
2. Singh N, Chaput L, Villoutreix BO: Virtual Screening Web Servers: Designing Chemical Probes and Drug Candidates in the Cyberspace. Brief. Bioinform. 2021; 22(2): 1790–1818.
3. Willems H, De Cesco S, Svensson F: Computational Chemistry on a Budget: Supporting Drug Discovery with Limited Resources. J. Med. Chem. 2020; 63(18): 10158–10169.
4. Click2Drug: Accessed Apr 7, 2021.
5. Macs in Chemistry: Accessed Apr 7, 2021.
6. Bajorath J: Progress in Computational Medicinal Chemistry. J. Med. Chem. 2012; 55(8): 3593–3594.
7. Merz KM Jr, Amaro R, Cournia Z, et al.: Editorial: Method and Data Sharing and Reproducibility of Scientific Results. J. Chem. Inf. Model. 2020; 60(12): 5868–5869.
8. Muratov EN, Bajorath J, Sheridan RP, et al.: QSAR without Borders. Chem. Soc. Rev. 2020; 49(11): 3525–3564.
9. Fourches D, Muratov E, Tropsha A: Trust, but Verify II: A Practical Guide to Chemogenomics Data Curation. J. Chem. Inf. Model. 2016; 56(7): 1243–1252.
10. Temml V, Schuster D: Molecular Docking for Natural Product Investigations: Pitfalls and Ways to Overcome Them. In: Molecular Docking for Computer-Aided Drug Design. Elsevier; 2021; pp 391–405.
11. Scior T: Do It Yourself—Dock It Yourself: General Concepts and Practical Considerations for Beginners to Start Molecular Ligand–Target Docking Simulations. In: Molecular Docking for Computer-Aided Drug Design. Elsevier; 2021; pp 205–227.
12. Scior T, Bender A, Tresadern G, et al.: Recognizing Pitfalls in Virtual Screening: A Critical Review. J. Chem. Inf. Model. 2012; 52(4): 867–881.
13. Varnek A, Baskin II: Chemoinformatics as a Theoretical Chemistry Discipline. Mol. Inform. 2011; 30(1): 20–32.
14. López-López E, Bajorath J, Medina-Franco JL: Informatics for Chemistry, Biology, and Biomedical Sciences. J. Chem. Inf. Model. 2021; 61(1): 26–35.
15. Lipinski CA: Lead- and Drug-like Compounds: The Rule-of-Five Revolution. Drug Discov. Today Technol. 2004; 1(4): 337–341.

Comments on this article

Reader Comment 02 Sep 2021

Piotr Minkiewicz, University of Warmia and Mazury, Olsztyn, Poland

I would like to thank Authors for this publication. I would like also point out that area of applicability of concepts presented here is broader than drug design. They may be applied also to in silico research concerning many bioactive components of food. Moreover, table containing misconceptions and correct meaning of cheminformatics concepts is extremely useful as potential part of lectures concerning cheminformatics and bioinformatics.
Competing Interests: No competing interests were disclosed.

Reader Comment 21 Jun 2021

Mohammad Rizki Fadhil Pratama, Universitas Muhammadiyah Palangkaraya, Indonesia

I will get to the point, I agree with the contents of this version of the manuscript. The COVID-19 pandemic has indeed opened up both good and bad sides of CADD. CADD research is becoming very popular, especially related to COVID-19, but on the other hand, new CADD "experts" have emerged, who unfortunately are not really experienced in this field. Hopefully, this paper can open up readers' insights regarding the "dark" and "light" sides of CADD before actually deciding to delve into it.
Competing Interests: No competing interests were disclosed.

Reader Comment 17 Jun 2021

Marawan, University of Alberta, Canada

Nice read. As someone who specializes in CADD and molecular modeling, I admit that the concerns the authors raised are true. What I found very awkward was the surge of papers claiming to have found real SARS-CoV-2 treatments by just doing a bunch of 'button pressings' in molecular docking software. That was really annoying. As someone who learned the field and developed the skills from the bottom up, starting from basic quantum chemistry models, I find myself in a dilemma when I see these papers with crude MM force fields claiming to have found a solution to a problem that no one else has solved. I just want to add that experimentalists should also be worried. There is no guarantee that what they measure in a lab is the truth. At best, it is still an approximation of what happens in reality, e.g. in the human body. CADD is like a mining area where people have thrown many fake stones; only an expert can find the real ones, which are unfortunately very rare.
Competing Interests: No competing interests were disclosed.

Reader Comment 14 Jun 2021

Filip Miljković, AstraZeneca, Sweden

Thanks to the authors for compiling a comprehensive list of misconceptions encountered in the field of CADD (and related disciplines and sub-disciplines). I agree with each point made. I suggest that the authors regularly update and republish this piece of work so that CADD practitioners are constantly reminded to treat their work with scientific rigour. Possibly, with sufficient effort, some of these concepts could be adopted by CADD journals and form a basis for manuscript revision. I would add a few more points as suggestions to the authors:

(Published) data is unquestionable and can be used at face value - Combining outcomes from different experimental techniques requires full attention, and the data curation process must be described in greater detail to judge whether results derived from the processed data are trustworthy. Often, published data sets are taken at face value and used to benchmark new or other methods against already published results. Further curation and exploration are required before committing to computational experiments.

Any performance metric is good enough to evaluate my models - Model performance should be evaluated using several different performance metrics and against an experimental error estimate or dummy model performance whenever possible.

A performance increase of a few points makes my model better than the current state of the art - Hardly anyone will accept a new approach as worthwhile if the performance increase is only minimal; such an increase might have been achieved by chance and requires further exploration (i.e. the results are not worth publishing).

Deep learning is better than conventional methods - This is often not true, and conventional machine learning methods should be used instead when the 'cool new technique' achieves no significant performance gain.

Preprint articles are as good as peer-reviewed papers - Preprints should be treated with caution and used only after the peer-review process; the study may be rejected or changed so significantly that the initial claims have little or no value.
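
The point about using several metrics and a dummy baseline can be sketched in a few lines of Python. This is only a minimal illustration and not part of the original comment: the toy activity labels, the function names (`confusion`, `metrics`, `majority_baseline`) and the choice of accuracy, balanced accuracy and the Matthews correlation coefficient (MCC) are all assumptions made for the example.

```python
import math
from collections import Counter


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = active, 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    """Compute accuracy, balanced accuracy and MCC from the confusion counts."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": mcc,
    }


def majority_baseline(y_train, n):
    """Dummy model: always predict the most frequent training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return [majority] * n


# Hypothetical, imbalanced toy data: 2 actives among 20 compounds.
y_true = [1, 1] + [0] * 18
# Hypothetical model output: one active found, one false positive.
y_model = [1, 0] + [0] * 17 + [1]
# For brevity the baseline is fitted on the same labels; in practice it
# would be fitted on the training set only.
y_dummy = majority_baseline(y_true, len(y_true))

for name, pred in [("model", y_model), ("dummy", y_dummy)]:
    print(name, {k: round(v, 3) for k, v in metrics(y_true, pred).items()})
```

On this toy set the model and the dummy baseline reach the same accuracy (0.9), while balanced accuracy and MCC separate them. A single metric reported without a baseline can therefore make a model look better than it is on imbalanced activity data.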
Competing Interests: No competing interests were disclosed.

Reader Comment 14 Jun 2021

Abel Suárez, Universidad Michoacana de San Nicolás de Hidalgo, Institute for Chemical Biological Research, Mexico

Thank you very much for sharing this article, which I consider to be of the utmost importance for those who are beginning to work in CADD, and also for those of us who have a little experience and have on some occasions made basic errors that affected the conclusions or the discussion of our research.

I believe that it will also be very useful for those who have extensive experience in the area, as it will surely awaken their interest in preparing and motivating beginners to use CADD tools always with the aim of answering relevant questions and objectives within an investigation.

Let me share some points of view now that I have read the article in its entirety:

I believe that the main problem is that most of us who are dedicated to this area of chemistry start without any formal academic preparation, and that the first calculations we carry out can lead us to generate systematic errors. The work becomes a mechanical task performed without deep knowledge of the fundamental principles of how a CADD tool or method works, which leads us to regard results as “promising” (and other adjectives used arbitrarily) when in fact they are not. I have always kept in mind that the “results” we obtain from CADD are predictions of variables that may or may not be observable through an experiment, and that these predictions are always subject to other variables that are not included in a CADD method or tool because of its nature and limitations.

Regarding Table 1:
I confess that I personally confused the concepts of theoretical chemistry, molecular modelling and cheminformatics for a long time because of a lack of academic training in CADD (although this is no justification).

I am very surprised by the way and the context in which the erroneous concepts are stated, since they seem to me to be very basic errors, and I realize that I, at least, have not voiced them or held those ideas. However, when reading what many students and users of CADD tools write (especially on social networks), I realize that this is the case: as shown in Table 1, CADD seems to be regarded as a quick way to establish associations and relationships and to draw conclusions that are "safe and plausible."

Another error that I have frequently seen is that it is sometimes stated or taken for granted that the "results" obtained indicate that this or that chemical compound will be a good drug candidate; some even dare to call it an "inhibitor" when there is no experimental information demonstrating that biological activity.

Another error that is also quite worrying is that, most of the time, it is completely forgotten that one should preferably start from previous experimental information in either of the two approaches, ligand-based or structure-based. An example is using the crystallographic structure of a pharmacological target from the PDB without taking into account the minimum essential structural characteristics, such as the conformation of rotatable amino acid residues, the tautomeric state, etc., which are elementary chemical characteristics needed to begin, for example, a docking study.

Finally, I must say that this is the first time I have read an article like this, one that is concerned about the use and direction that CADD has taken in recent years, which has turned it into the mechanical execution of calculations (in black boxes) that has not contributed as much as would be expected to the advance of pharmacology. Given the existing advances in the area and the computational power available, much greater benefit could be obtained if all of us who dedicate ourselves to this did it rationally and appropriately.
Competing Interests: No competing interests were disclosed.

Reader Comment 25 May 2021

Maurizio Recanatini, University of Bologna, Dept. of Pharmacy and Biotechnology, Italy

This paper makes the point in a precise and very appropriate way regarding the use and misuse of computational tools in drug discovery. I agree 100%, and thank the authors for their commitment to promoting rigor in the use of the methods and also of the expressions related to the field (Table 1 is great).
Competing Interests: No competing interests were disclosed.

Reader Comment 25 May 2021

Rodrigo Gutierrez, Facultad de Química, UNAM, Mexico

I agree with the points mentioned in the review. Firstly, as in every research project, a bibliographic search must be carried out in order to get the concepts right without mixing up meanings; this will avoid publishing texts with incorrect terms. Furthermore, the notion that computational tools are not valid because they lack experimental validation must change, because without computational processes many approved drugs or vaccines, especially the one against COVID-19, might not exist today. CADD is a very useful tool in the drug design process; it helps to reduce time and save money. One point I would like to emphasize is the ease of use of the tools that support the drug design process: if these tools are not used properly, the results obtained will be affected, and instead of being helpful the tools will be damaging. Since the beginning of the pandemic caused by SARS-CoV-2, many articles proposing new drug candidates have been written, confirming how easy and fast these tools are to use, but the hype around finding active compounds against SARS-CoV-2 led people who had access to the tools to write articles lacking information and rationality. As the authors mention, this kind of action devalues computational methods because wrong information is circulating on the internet. To conclude, I would like to emphasize a point that is crucial not only in the scientific field but also in daily life, namely to “seek supervision or advice from experts and do not hesitate to ask”. If every person followed this advice, many accidents would be prevented and science would be more advanced.
Competing Interests: No competing interests were disclosed.
Competing interests

No competing interests were disclosed.

Grant information

KM-M thanks DGAPA-PASPA for financial support.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Article Versions (1)

version 1

Published: 18 May 2021, 10:397

https://doi.org/10.12688/f1000research.52676.1

Copyright

© 2021 Medina-Franco JL et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite this article

Medina-Franco JL, Martinez-Mayorga K, Fernández-de Gortari E et al. Rationality over fashion and hype in drug design [version 1; peer review: 2 approved] F1000Research 2021, 10(Chem Inf Sci):397 (https://doi.org/10.12688/f1000research.52676.1)

NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.

Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.

Reviewer Report 10 Jun 2021

Maria Sorokina, Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University Jena, Jena, Germany

Approved

https://doi.org/10.5256/f1000research.55986.r86331

The authors discuss the dangers of the growing trend of using computer-assisted drug discovery tools without proper training or understanding of the fundamental concepts underlying such approaches. The topic is indeed very important to discuss. The past year of working away from the bench, the tremendous number of publications produced using CADD and other bioinformatic tools, often not entirely correctly, and the examples illustrated in Table 1 all emphasize the need for a reminder that computational tools, although easy to use, still require an understanding of the concepts they are built on.

Regarding the sentence ‘Avoid excessive use of buzzwords such as “artificial intelligence” or “machine learning” when they are not applicable, which contributes to inappropriate hype associated with computational methods’, I would also add avoiding the use of “deep learning”, another very trendy buzzword that is too often misused.

I thank the authors for this article, as such topics are extremely important to voice.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes

Are all factual statements correct and adequately supported by citations? Yes

Are arguments sufficiently supported by evidence from the published literature? Yes

Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report 19 May 2021

Bruno O. Villoutreix, l’Institut national de la santé et de la recherche médicale (INSERM), Paris, France

Approved

https://doi.org/10.5256/f1000research.55986.r85580

This review about fashion and hype in the field of drug design, written by experts, is indeed very timely and of high interest. The observation applies to many fields related to medicine and biology, to most technologies and even to research topics.

This opinion paper should be read by politicians, decision makers, journalists, students, citizens not involved in research and many scientists. Even more so, it should be read by people defining guidelines for grant applications and by investors who are ready to put millions into a new “in silico” technology that is going to save the world but that has never been published and is documented only in some nice brochures full of buzzwords (the so-called proprietary tools that nobody can try or evaluate).

Indeed, nowadays, if a grant application or a research paper does not use, every two sentences, words like game-changer, disruptive, AI, big data, multi-scale, large scale, ultra-large something or quantum something, it has little chance of succeeding (in my opinion, “quantum whatever” is going to be the next hype after AI, and if you have both, AI + quantum computing or quantum scoring or quantum ADMET or quantum digital twin, then you win the lottery, meaning grants, investors’ money, promotion, medals, prizes and fame, and the person will definitely go on TV, get numerous "likes" on social networks and become a scientific star invited by politicians to guide the remaining ignorant scientists who do not buy buzzwords…).

The complexity is that in some cases these methods are going to help, while in others this is the wind blowing in the forest. Experts in the field know, but they are not invited to comment or, if they are, they do not say anything because they are afraid of being considered old school. The reality is of course very different. There is no problem with being enthusiastic about some approaches, about developing "user-friendly" methods or about testing new concepts, but global brainwashing, overselling and propaganda about AI and related topics are damaging research, and in the field of health these promises will hurt patients and confuse people even more.

How can this be stopped, given the millions of dollars behind it, global mass brainwashing and the fact that most humans prefer fairy tales to the truth? I do not know.

The present opinion paper can definitely help. The one difficulty I see is that the people who should read it will not do so and, worse, do not want to hear about such a discussion, as it may kill their business plan or their chance of becoming famous. Maybe, if the part of the scientific community that refuses “fashion research” starts to make noise, some changes will occur.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes

Are all factual statements correct and adequately supported by citations? Yes

Are arguments sufficiently supported by evidence from the published literature? Yes

Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: drug discovery, structural bioinformatics, chemoinformatics, molecular medicine

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Comments on this article (7)

Reader Comment 02 Sep 2021

Piotr Minkiewicz, University of Warmia and Mazury, Olsztyn, Poland

I would like to thank the authors for this publication. I would also like to point out that the area of applicability of the concepts presented here is broader than drug design. They may also be applied to in silico research concerning many bioactive components of food. Moreover, the table containing misconceptions and the correct meaning of cheminformatics concepts is extremely useful as a potential part of lectures on cheminformatics and bioinformatics.
Competing Interests: No competing interests were disclosed.

Reader Comment 21 Jun 2021

Mohammad Rizki Fadhil Pratama, Universitas Muhammadiyah Palangkaraya, Indonesia

I will get to the point: I agree with the contents of this version of the manuscript. The COVID-19 pandemic has indeed revealed both the good and bad sides of CADD. CADD research is becoming very popular, especially in relation to COVID-19, but on the other hand, new CADD "experts" have emerged who unfortunately are not really experienced in this field. Hopefully, this paper can open readers' eyes to the "dark" and "light" sides of CADD before they decide to delve into it.
Competing Interests: No competing interests were disclosed.

Reader Comment 17 Jun 2021

Marawan null, University of Alberta, Academic, Canada

17 Jun 2021

Reader Comment

Nice read. As someone who specializes in CADD and molecular modeling, I admit that the concerns the authors raise are valid. What struck me as very awkward was the surge of papers claiming to have found real SARS-CoV-2 treatments by just doing a bunch of 'button pressings' in molecular docking software. That was really annoying. As someone who learned the field and developed the skills from the bottom up, starting from basic quantum chemistry models, I find myself in a dilemma when I see papers using crude MM force fields claim to have solved a problem that no one else could. I would only add that experimentalists should also be worried. There is no guarantee that what they measure in a lab is the truth. At best, it is still an approximation of what happens in reality, e.g. in the human body. CADD is like a mining area where people have thrown many fake stones; only an expert can find the real ones, which are unfortunately very rare.
Competing Interests: No competing interests were disclosed.
Reader Comment 14 Jun 2021

Filip Miljković, AstraZeneca, Sweden

Thanks to the authors for compiling a comprehensive list of misconceptions encountered in the field of CADD (and related disciplines and sub-disciplines). I agree with each point made. I suggest that the authors regularly update and republish this piece of work so that CADD practitioners are constantly reminded to treat their work with scientific rigour. Possibly, with sufficient effort, some of these concepts could be adopted by CADD journals and serve as a basis for manuscript revision. I would suggest a few more points to the authors:

(Published) data is unquestionable and can be used at face value - Combining outcomes from different experimental techniques requires full attention, and the data curation process must be described in greater detail to judge whether results derived from the processed data are trustworthy. Often, published data sets are taken at face value and used to benchmark new or other methods against already published results. Further curation and exploration are required before committing to computational experiments.

Any performance metric is good enough to evaluate my models - Model performance should be evaluated using several different performance metrics and compared against an estimate of the experimental error or the performance of a dummy model whenever possible.

A performance increase of a few points makes my model better than the current state of the art - Hardly anyone will accept a new approach as worthwhile if the performance increase is only minimal. Such an increase might have been achieved by chance and requires further exploration (i.e. the results are not worth publishing).

Deep learning is better than conventional methods - This is often not true, and conventional machine learning methods should be used instead when the 'cool new technique' achieves no significant performance gain.

Preprint articles are as good as peer-reviewed papers - Preprints should be treated with caution and relied on only after the peer-review process. The study may be rejected or changed significantly, so that the initial claims have little or no value.
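
The points above about performance metrics and small performance gains can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the toy labels and predictions are invented for this example (they do not come from the article or from any real data set), all function names are hypothetical, and the paired bootstrap is one of several reasonable ways to check whether a gain could be due to chance.

```python
import math
import random


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (0.0 for an empty class)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens + spec) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def majority_baseline(y_true):
    """Dummy model: always predict the most frequent class."""
    majority = 1 if sum(y_true) * 2 > len(y_true) else 0
    return [majority] * len(y_true)


def bootstrap_diff_ci(y_true, pred_a, pred_b, metric, n_boot=2000, seed=0):
    """Percentile 95% interval for metric(A) - metric(B), paired bootstrap."""
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample compounds
        yt = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(metric(yt, a) - metric(yt, b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Invented toy set: 10 actives (1) and 90 inactives (0).
y_true = [1] * 10 + [0] * 90
# Hypothetical model: finds 2 of 10 actives and makes 1 false positive.
model_pred = [1] * 2 + [0] * 8 + [1] * 1 + [0] * 89
dummy_pred = majority_baseline(y_true)

for name, metric in [("accuracy", accuracy),
                     ("balanced accuracy", balanced_accuracy),
                     ("MCC", mcc)]:
    print(f"{name}: model={metric(y_true, model_pred):.3f} "
          f"dummy={metric(y_true, dummy_pred):.3f}")

low, high = bootstrap_diff_ci(y_true, model_pred, dummy_pred, accuracy)
print(f"95% interval for the accuracy gain over the dummy: [{low:.3f}, {high:.3f}]")
```

On this imbalanced toy set the model and the dummy differ by only one point in accuracy, whereas balanced accuracy and MCC separate them more clearly, which is why a single metric can mislead. An interval for the accuracy gain that includes zero would suggest that the gain alone should not be taken as evidence of a better model.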
Competing Interests: No competing interests were disclosed.
Reader Comment 14 Jun 2021

Abel Suárez, Universidad Michoacana de San Nicolás de Hidalgo, Institute for Chemical Biological Research, Mexico

Thank you very much for sharing this article, which I consider to be of the utmost importance for those who are starting out in CADD, and also for those of us who have a little experience and have on occasion made basic errors that affected the conclusions or the discussion of our research.

I believe that it will also be very useful for those with extensive experience in the area, as it will surely encourage them to prepare and motivate beginners to use CADD tools to answer relevant questions and meet the objectives of an investigation.

Let me offer some points of view now that I have read the article in its entirety:

I believe that the main problem is that most of us who are dedicated to this area of chemistry start without any formal academic preparation, and the first calculations we carry out can lead to systematic errors. The work becomes a mechanical task performed without a deep understanding of the fundamental principles of how a CADD tool or method works, which leads us to describe results as “promising” (and with other arbitrarily chosen adjectives) when in fact they are not. I have always kept in mind that the “results” we obtain from CADD are predictions of variables that may or may not be observable through an experiment, and that these predictions are always subject to other variables that a CADD method or tool does not include because of its nature and limitations.

Regarding Table 1:
I confess that I personally confused the concepts of theoretical chemistry, molecular modelling and chemoinformatics for a long time because of a lack of academic training in CADD (although this is no justification).

I am very surprised by the way and the context in which the erroneous concepts are stated, since they seem to me to be very basic errors, and I believe that I, at least, have not voiced them or held those ideas. However, when reading what many students and users of CADD tools write (especially on social networks), I realize that this is indeed the case: as shown in Table 1, CADD is seen as a quick way to establish associations and relationships and to draw conclusions that are "safe and plausible."

Another error that I have frequently seen is that it is stated or taken for granted that the "results" obtained indicate that this or that chemical compound will be a good drug candidate. Some even dare to call it an "inhibitor" when there is no experimental information demonstrating that biological activity.

Another quite worrying error is that, most of the time, it is completely forgotten that one should preferably start from prior experimental information in either of the two approaches, ligand-based or structure-based. An example is using the crystallographic structure of a pharmacological target from the PDB without taking into account the essential structural characteristics, such as the conformation of rotatable amino acid residues, the tautomeric state, etc., which are elementary chemical considerations when beginning, for example, a docking study.

Finally, I must say that this is the first time I have read an article like this, one that is concerned about the use and direction that CADD has taken in recent years, as it turns into the mechanical execution of calculations (in black boxes) that has not contributed to the advance of pharmacology as much as would be expected. Given the advances in the field and the computational power now available, much greater benefit could be obtained if all of us who dedicate ourselves to this did so rationally and appropriately.
Competing Interests: No competing interests were disclosed.
Reader Comment 25 May 2021

Maurizio Recanatini, University of Bologna, Dept. of Pharmacy and Biotechnology, Italy

This paper makes its point about the use and misuse of computational tools in drug discovery in a precise and very appropriate way. I agree 100%, and thank the authors for their commitment to promoting rigor in the use of the methods and of the terminology of the field (Table 1 is great).
Competing Interests: No competing interests were disclosed.
Reader Comment 25 May 2021

Rodrigo Gutierrez, Facultad de Química, UNAM, Mexico

I agree with the points made in the review. Firstly, as in every research project, a literature search must be carried out to get the concepts right without mixing up their meanings; this will avoid published texts with incorrect terms. Furthermore, the idea that computational tools are unreliable because they lack experimental validation must change, because without computational methods many approved drugs or vaccines might not exist today, especially the ones against COVID-19. CADD is a very useful tool in the drug design process, helping to reduce time and save money. One point I would like to emphasize is how easy the tools that support drug design are to use: if they are not used properly, the results will suffer, and instead of being helpful the tools will be damaging. Since the beginning of the pandemic caused by SARS-CoV-2, many articles proposing new drug candidates have been written, confirming how easy and fast these tools are to use, but the hype around finding active compounds against SARS-CoV-2 led people with access to the tools to write articles lacking information and rationality. As the authors mention, such actions devalue computational methods because incorrect information circulates on the internet. To conclude, I would like to emphasize a point that is crucial not only in science but also in daily life: “seek supervision or advice from experts and do not hesitate to ask”. If everyone followed this advice, many accidents would be prevented and science would be more advanced.
Competing Interests: No competing interests were disclosed.
Open Peer Review

Bruno O. Villoutreix, l’Institut national de la santé et de la recherche médicale (INSERM), Paris, France
Maria Sorokina, Friedrich-Schiller University Jena, Jena, Germany

Reviewer Report

10 Jun 2021 | for Version 1

Maria Sorokina, Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University Jena, Jena, Germany

Approved

The authors discuss the dangers of the growing trend of using computer-assisted drug discovery tools without proper training or an understanding of the fundamental concepts underlying such approaches. The topic is indeed very important to discuss. The past year of working away from the bench, the tremendous number of publications produced using CADD and other bioinformatic tools, often not entirely correctly, and the examples illustrated in Table 1 all emphasize the need for a reminder that computational tools, although easy to use, still require an understanding of the concepts they are built on.

Regarding the sentence ‘Avoid excessive use of buzzwords such as “artificial intelligence” or “machine learning” when they are not applicable, which contributes to inappropriate hype associated with computational methods’, I would add “deep learning”, another very trendy buzzword that is too often misused.

I thank the authors for this article, as it is extremely important that such topics are voiced.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes

Are all factual statements correct and adequately supported by citations? Yes

Are arguments sufficiently supported by evidence from the published literature? Yes

Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes

Competing Interests

No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report

19 May 2021 | for Version 1

Bruno O. Villoutreix, l’Institut national de la santé et de la recherche médicale (INSERM), Paris, France

Approved

This review about fashion and hype in the field of drug design, written by experts, is indeed very timely and of high interest. The observation applies to many fields related to medicine and biology, to most technologies and even to research topics.

This opinion paper should be read by politicians, decision makers, journalists, students, citizens not involved in research and many scientists. Even more so, it should be read by people defining guidelines for grant applications and by investors who are ready to put millions into a new “in silico” technology that is going to save the world but that has never been published and is only documented in some nice brochures full of buzzwords (the so-called proprietary tools that nobody can try or evaluate).

Indeed, nowadays, if a grant application or a research paper does not use words like game-changer, disruptive, AI, big data, multi-scale, large scale, ultra-large something or quantum something every two sentences, it has little chance of succeeding (in my opinion, “quantum whatever” is going to be the next hype after AI, and if you have both, AI + quantum computing or quantum scoring or quantum ADMET or quantum digital twin, then you win the lottery, meaning grants, investors’ money, promotion, medals, prizes and fame, and the person will definitely go on TV, get numerous "likes" on social networks and become a scientific star invited by politicians to guide the remaining ignorant scientists who do not buy buzzwords…).

The complexity is that in some cases these methods are going to help, while in others this is the wind blowing in the forest. Experts in the field know, but they are not invited to comment or, if they are, they do not say anything because they are afraid of being considered old school. The reality is of course very different. There is no problem with being enthusiastic about some approaches, about developing "user-friendly" methods or about testing new concepts, but global brainwashing, overselling and propaganda about AI and related methods are damaging research, and in the field of health these promises will hurt patients and confuse people even more.

How can this be stopped, given the millions of dollars behind it, global mass brainwashing and the fact that most humans prefer fairy tales to truth? I do not know.

The present opinion paper can definitely help. The one difficulty I see is that the people who should read it will not do so and, worse, do not want to hear about such a discussion, as it may kill their business plan or their chance of getting famous. Maybe, if the scientific community that refuses “fashion research” starts to make noise, some changes will occur.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes

Are all factual statements correct and adequately supported by citations? Yes

Are arguments sufficiently supported by evidence from the published literature? Yes

Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

drug discovery, structural bioinformatics, chemoinformatics, molecular medicine

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

[1] Gasteiger J: Chemistry in Times of Artificial Intelligence. ChemPhysChem. 2020; 21(20): 2233–2242.

[2] Singh N, Chaput L, Villoutreix BO: Virtual Screening Web Servers: Designing Chemical Probes and Drug Candidates in the Cyberspace. Brief. Bioinform. 2021; 22(2): 1790–1818.

[3] Willems H, De Cesco S, Svensson F: Computational Chemistry on a Budget: Supporting Drug Discovery with Limited Resources. J. Med. Chem. 2020; 63(18): 10158–10169.

[4] Click2Drug: Accessed Apr 7, 2021.

[5] Macs in Chemistry: Accessed Apr 7, 2021.

[6] Bajorath J: Progress in Computational Medicinal Chemistry. J. Med. Chem. 2012; 55(8): 3593–3594.

[7] Merz KM Jr, Amaro R, Cournia Z, et al.: Editorial: Method and Data Sharing and Reproducibility of Scientific Results. J. Chem. Inf. Model. 2020; 60(12): 5868–5869.

[8] Muratov EN, Bajorath J, Sheridan RP, et al.: QSAR without Borders. Chem. Soc. Rev. 2020; 49(11): 3525–3564.

[9] Fourches D, Muratov E, Tropsha A: Trust, but Verify II: A Practical Guide to Chemogenomics Data Curation. J. Chem. Inf. Model. 2016; 56(7): 1243–1252.

[10] Temml V, Schuster D: Molecular Docking for Natural Product Investigations: Pitfalls and Ways to Overcome Them. In: Molecular Docking for Computer-Aided Drug Design. Elsevier; 2021; pp 391–405.

[11] Scior T: Do It Yourself—Dock It Yourself: General Concepts and Practical Considerations for Beginners to Start Molecular Ligand–Target Docking Simulations. In: Molecular Docking for Computer-Aided Drug Design. Elsevier; 2021; pp 205–227.

[12] Scior T, Bender A, Tresadern G, et al.: Recognizing Pitfalls in Virtual Screening: A Critical Review. J. Chem. Inf. Model. 2012; 52(4): 867–881.

[13] Varnek A, Baskin II: Chemoinformatics as a Theoretical Chemistry Discipline. Mol. Inform. 2011; 30(1): 20–32.

[14] López-López E, Bajorath J, Medina-Franco JL: Informatics for Chemistry, Biology, and Biomedical Sciences. J. Chem. Inf. Model. 2021; 61(1): 26–35.

[15] Lipinski CA: Lead- and Drug-like Compounds: The Rule-of-Five Revolution. Drug Discov. Today Technol. 2004; 1(4): 337–341.

