Abstract
Patents from medicinal chemistry represent a rich source of novel compounds and activity data that appear only infrequently in the scientific literature. Moreover, patent information provides a primary focal point for drug discovery. Accordingly, text mining and image extraction approaches have become hot topics in patent analysis and repositories of patent data are being established. In this work, we have generated network representations using alternative similarity measures to systematically compare molecules from patents with other bioactive compounds, visualize similarity relationships, explore the chemical neighbourhood of patent molecules, and identify closely related compounds with different activities. The design of network representations that combine patent molecules and other bioactive compounds and view patent information in the context of current bioactive chemical space aids in the analysis of patents and further extends the use of molecular networks to explore structure–activity relationships.
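As an illustration of the kind of network representation described above, the following minimal sketch builds a threshold-based similarity network over a mixed set of patent and other bioactive molecules. It rests on assumptions that are not taken from this work: fingerprints are represented as toy sets of on-bits, similarity is the standard Tanimoto coefficient, and the molecule identifiers, the `build_network` helper, and the 0.5 threshold are hypothetical. The similarity measures and network design actually used here may differ.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_network(fingerprints, threshold):
    """Return edges (id1, id2, similarity) for all pairs at or above threshold.

    Nodes are molecules; an edge is drawn when the pairwise similarity
    reaches the chosen threshold (a hypothetical design parameter here).
    """
    edges = []
    for id1, id2 in combinations(sorted(fingerprints), 2):
        sim = tanimoto(fingerprints[id1], fingerprints[id2])
        if sim >= threshold:
            edges.append((id1, id2, sim))
    return edges


# Toy fingerprints (illustrative only, not real molecules).
fingerprints = {
    "patent_1": {1, 2, 3, 4},
    "patent_2": {1, 2, 3, 5},
    "bioactive_1": {1, 2, 3, 4, 6},
    "bioactive_2": {7, 8, 9},
}

edges = build_network(fingerprints, threshold=0.5)

# Edges linking a patent molecule to a non-patent bioactive compound,
# i.e. the chemical neighbourhood of patent molecules in the mixed network.
mixed = [
    e for e in edges
    if e[0].startswith("patent") != e[1].startswith("patent")
]
```

In this toy set, `mixed` holds the edges that connect patent molecules to other bioactive compounds. In a real analysis, such cross-set edges would be the starting point for comparing the activity annotations of closely related compounds.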
Cite this article
Kunimoto, R., Bajorath, J. Exploring sets of molecules from patents and relationships to other active compounds in chemical space networks. J Comput Aided Mol Des 31, 779–788 (2017). https://doi.org/10.1007/s10822-017-0061-2
DOI: https://doi.org/10.1007/s10822-017-0061-2