Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

Klarner, Leo; Rudner, Tim G. J.; Reutlinger, Michael; Schindler, Torsten; Morris, Garrett M.; Deane, Charlotte; Teh, Yee Whye

arXiv:2307.15073 (q-bio)

[Submitted on 14 Jul 2023]

Abstract: Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift, a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.

Comments:	Published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023)
Subjects:	Biomolecules (q-bio.BM); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2307.15073 [q-bio.BM]
	(or arXiv:2307.15073v1 [q-bio.BM] for this version)
	https://doi.org/10.48550/arXiv.2307.15073

Submission history

From: Tim G. J. Rudner
[v1] Fri, 14 Jul 2023 05:01:10 UTC (1,849 KB)
