  • Protocol

Reproducible molecular networking of untargeted mass spectrometry data using GNPS

Abstract

Global Natural Product Social Molecular Networking (GNPS) is an interactive online small molecule–focused tandem mass spectrometry (MS2) data curation and analysis infrastructure. It is intended to provide as much chemical insight as possible into an untargeted MS2 dataset and to connect this chemical insight to the user’s underlying biological questions. This can be performed within one liquid chromatography (LC)-MS2 experiment or at the repository scale. GNPS-MassIVE is a public data repository for untargeted MS2 data with sample information (metadata) and annotated MS2 spectra. These publicly accessible data can be annotated and updated with the GNPS infrastructure, which keeps a continuous record of all changes. This knowledge is disseminated across all public data; it is a living dataset. Molecular networking, one of the main analysis tools used within the GNPS platform, creates a structured data table that reflects the molecular diversity captured in tandem mass spectrometry experiments by computing the relationships between MS2 spectra as spectral similarity. This protocol provides step-by-step instructions for creating reproducible, high-quality molecular networks. For training purposes, the reader is led through a 90- to 120-min procedure that starts by recalling an example public dataset and its sample information and proceeds to creating and interpreting a molecular network. Each data analysis job can be shared or cloned to disseminate the knowledge gained, thus propagating information that can lead to the discovery of molecules, metabolic pathways, and ecosystem/community interactions.
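To make the idea of relating MS2 spectra by spectral similarity concrete, the following is a minimal Python sketch, not the GNPS implementation. It scores pairs of spectra with a plain cosine over greedily matched peaks and keeps pairs above a threshold as network edges. The function names, the fragment tolerance of 0.02, the score threshold of 0.7 and the three toy spectra are all illustrative assumptions; the scoring and filtering that GNPS actually applies are more elaborate and are not reproduced here.

```python
from itertools import combinations
from math import sqrt

# A spectrum is a list of (m/z, intensity) peaks; all values below are made up.
Spectrum = list[tuple[float, float]]


def cosine_score(a: Spectrum, b: Spectrum, tol: float = 0.02) -> float:
    """Plain cosine similarity over greedily matched peaks (hypothetical helper)."""
    norm_a = sqrt(sum(i * i for _, i in a))
    norm_b = sqrt(sum(i * i for _, i in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # Candidate peak pairs whose m/z differ by at most tol, largest products first.
    pairs = sorted(
        ((ia * ib, x, y)
         for x, (mza, ia) in enumerate(a)
         for y, (mzb, ib) in enumerate(b)
         if abs(mza - mzb) <= tol),
        reverse=True,
    )
    used_a, used_b, dot = set(), set(), 0.0
    for product, x, y in pairs:
        if x not in used_a and y not in used_b:  # each peak is matched at most once
            used_a.add(x)
            used_b.add(y)
            dot += product
    return dot / (norm_a * norm_b)


def build_network(spectra: dict[str, Spectrum], min_score: float = 0.7):
    """Return edges (id1, id2, score) for spectrum pairs scoring >= min_score."""
    edges = []
    for (id1, s1), (id2, s2) in combinations(spectra.items(), 2):
        score = cosine_score(s1, s2)
        if score >= min_score:
            edges.append((id1, id2, round(score, 3)))
    return edges


# Toy data: A and B share all fragments, C shares none with either.
spectra = {
    "A": [(100.0, 10.0), (150.0, 50.0), (200.0, 100.0)],
    "B": [(100.01, 12.0), (150.0, 45.0), (200.0, 95.0)],
    "C": [(80.0, 100.0), (120.0, 30.0)],
}
edges = build_network(spectra)
print(edges)
```

In this toy example only the pair A and B is connected, so the resulting edge list is the smallest possible "molecular family". In a real network, nodes would be consensus spectra and the edge table would be exported for visualization, for example in Cytoscape.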

Fig. 1: Schematic representation of the process for creating a molecular network from tandem mass spectra acquired for metabolites in complex sample mixtures.
Fig. 2: Flowchart of the protocol, delineating the workflow through Step 32 (Steps 33–53 address optional analyses, visualizations and sharing of data and molecular networks).
Fig. 3: Mouse duodenum global molecular network created from MassIVE dataset and visualized in Cytoscape.
Fig. 4
Fig. 5
Fig. 6: Propagation of molecular networking to discover relationships between molecules.
Fig. 7: Networking of the stenothricin natural product molecular family (MSV000083381) detected in Streptomyces sp. DSM5940 (purple nodes), S. roseosporus NRRL 15998 (green nodes) or both strains (yellow nodes).
Fig. 8: Molecular family (a subnetwork) of quinolones detected in lung tissue extracts and cultured Pseudomonas isolates created from MassIVE dataset MSV000083359.

Data availability

All LC–MS data used in this paper are publicly available at the GNPS-MassIVE repository under the following accession numbers.

MSV000083437 (GF and SPF mice, data not shown)

MSV000083359 (3D cartography of diseased human lung; ref. 47)

MSV000083381 (stenothricin-GNPS analogs; ref. 11)

References

  1. Watrous, J. et al. Mass spectral molecular networking of living microbial colonies. Proc. Natl Acad. Sci. USA 109, E1743–E1752 (2012).

  2. Traxler, M. F. & Kolter, R. A massively spectacular view of the chemical lives of microbes. Proc. Natl Acad. Sci. USA 109, 10128–10129 (2012).

  3. Fox Ramos, A. E., Evanno, L., Poupon, E., Champy, P. & Beniddir, M. A. Natural products targeting strategies involving molecular networking: different manners, one goal. Nat. Prod. Rep. 36, 960–980 (2019).

  4. Teta, R. et al. A joint molecular networking study of a Smenospongia sponge and a cyanobacterial bloom revealed new antiproliferative chlorinated polyketides. Org. Chem. Front. 6, 1762–1774 (2019).

  5. Kalinski, J. J. et al. Molecular networking reveals two distinct chemotypes in pyrroloiminoquinone-producing Tsitsikamma favus sponges. Mar. Drugs 17, 60 (2019).

  6. Raheem, D. J., Tawfike, A. F., Abdelmohsen, U. R., Edrada-Ebel, R. & Fitzsimmons-Thoss, V. Application of metabolomics and molecular networking in investigating the chemical profile and antitrypanosomal activity of British bluebells (Hyacinthoides non-scripta). Sci. Rep. 9, 2547 (2019).

  7. Trautman, E. P., Healy, A. R., Shine, E. E., Herzon, S. B. & Crawford, J. M. Domain-targeted metabolomics delineates the heterocycle assembly steps of colibactin biosynthesis. J. Am. Chem. Soc. 139, 4195–4201 (2017).

  8. Vizcaino, M. I., Engel, P., Trautman, E. & Crawford, J. M. Comparative metabolomics and structural characterizations illuminate colibactin pathway-dependent small molecules. J. Am. Chem. Soc. 136, 9244–9247 (2014).

  9. Nguyen, D. D. et al. Indexing the Pseudomonas specialized metabolome enabled the discovery of poaeamide B and the bananamides. Nat. Microbiol. 2, 16197 (2016).

  10. Frank, A. M. et al. Clustering millions of tandem mass spectra. J. Proteome Res. 7, 113–122 (2008).

  11. Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837 (2016).

  12. Frank, A. M. et al. Spectral archives: extending spectral libraries to analyze both identified and unidentified spectra. Nat. Methods 8, 587–591 (2011).

  13. De Vijlder, T. et al. A tutorial in small molecule identification via electrospray ionization-mass spectrometry: the practical art of structural elucidation. Mass Spectrom. Rev. 37, 607–629 (2018).

  14. Artyukhin, A. B. et al. Metabolomic “dark matter” dependent on peroxisomal β-oxidation in Caenorhabditis elegans. J. Am. Chem. Soc. 140, 2841–2852 (2018).

  15. Edwards, E. D., Woolly, E. F., McLellan, R. M. & Keyzers, R. A. Non-detection of honeybee hive contamination following Vespula wasp baiting with protein containing fipronil. PLoS One 13, e0206385 (2018).

  16. Hoffmann, T. et al. Correlating chemical diversity with taxonomic distance for discovery of natural products in myxobacteria. Nat. Commun. 9, 803 (2018).

  17. Leipoldt, F. et al. Warhead biosynthesis and the origin of structural diversity in hydroxamate metalloproteinase inhibitors. Nat. Commun. 8, 1965 (2017).

  18. Kang, K. B., Gao, M., Kim, G. J., Choi, H. & Sung, S. H. Rhamnellosides A and B, omega-phenylpentaene fatty acid amide diglycosides from the fruits of Rhamnella franguloides. Molecules 23, 752 (2018).

  19. Remy, S. et al. Structurally diverse diterpenoids from Sandwithia guyanensis. J. Nat. Prod. 81, 901–912 (2018).

  20. Riewe, D., Wiebach, J. & Altmann, T. Structure annotation and quantification of wheat seed oxidized lipids by high-resolution LC-MS/MS. Plant Physiol. 175, 600–618 (2017).

  21. Senges, C. H. R. et al. The secreted metabolome of Streptomyces chartreusis and implications for bacterial chemistry. Proc. Natl Acad. Sci. USA 115, 2490–2495 (2018).

  22. van der Hooft, J. J. J. et al. Unsupervised discovery and comparison of structural families across multiple samples in untargeted metabolomics. Anal. Chem. 89, 7569–7577 (2017).

  23. Wolff, H. & Bode, H. B. The benzodiazepine-like natural product tilivalline is produced by the entomopathogenic bacterium Xenorhabdus eapokensis. PLoS One 13, e0194297 (2018).

  24. Schymanski, E. L. et al. Critical assessment of small molecule identification 2016: automated methods. J. Cheminf. 9, 22 (2017).

  25. Beniddir, M. MTBLS142: collected tandem mass spectrometry data on monoterpene indole alkaloids from natural product chemistry research. MetaboLights https://www.ebi.ac.uk/metabolights/MTBLS142 (2018).

  26. Lei, Z. et al. Construction of an ultrahigh pressure liquid chromatography-tandem mass spectral library of plant natural products and comparative spectral analyses. Anal. Chem. 87, 7373–7381 (2015).

  27. Nikolic, D., Jones, M., Sumner, L. & Dunn, W. CASMI 2014: challenges, solutions and results. Curr. Metab. 5, 5–17 (2017).

  28. Horai, H. et al. MassBank: a public repository for sharing mass spectral data for life sciences. J. Mass Spectrom. 45, 703–714 (2010).

  29. Stravs, M. A., Schymanski, E. L., Singer, H. P. & Hollender, J. Automatic recalibration and processing of tandem mass spectra using formula annotation. J. Mass Spectrom. 48, 89–99 (2013).

  30. von Eckardstein, L. et al. Total synthesis and biological assessment of novel albicidins discovered by mass spectrometric networking. Chemistry 23, 15316–15321 (2017).

  31. Vizcaino, M. I. & Crawford, J. M. The colibactin warhead crosslinks DNA. Nat. Chem. 7, 411–417 (2015).

  32. Saleh, H. et al. Deuterium-labeled precursor feeding reveals a new pABA-containing meroterpenoid from the mango pathogen Xanthomonas citri pv. mangiferaeindicae. J. Nat. Prod. 79, 1532–1537 (2016).

  33. Fox Ramos, A. E. et al. Collected mass spectrometry data on monoterpene indole alkaloids from natural product chemistry research. Sci. Data 6, 15 (2019).

  34. Aron, A. T. et al. Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Preprint at https://doi.org/10.26434/chemrxiv.9333212.v1 (2019).

  35. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  36. Petras, D. et al. Mass spectrometry-based visualization of molecules associated with human habitats. Anal. Chem. 88, 10775–10784 (2016).

  37. Kapono, C. A. et al. Creating a 3D microbial and chemical snapshot of a human habitat. Sci. Rep. 8, 3669 (2018).

  38. Adams, R. I. et al. Microbes and associated soluble and volatile chemicals on periodically wet household surfaces. Microbiome 5, 128 (2017).

  39. Petras, D. et al. High-resolution liquid chromatography tandem mass spectrometry enables large scale molecular characterization of dissolved organic matter. Front. Mar. Sci. 4, 405 (2017).

  40. Trautman, E. P. & Crawford, J. M. Linking biosynthetic gene clusters to their metabolites via pathway-targeted molecular networking. Curr. Top. Med. Chem. 16, 1705–1716 (2016).

  41. Luzzatto-Knaan, T., Melnik, A. V. & Dorrestein, P. C. Mass spectrometry uncovers the role of surfactin as an interspecies recruitment factor. ACS Chem. Biol. 14, 459–467 (2019).

  42. Machushynets, N. V., Wu, C., Elsayed, S. S., Hankemeier, T. & van Wezel, G. P. Discovery of novel glycerolated quinazolinones from Streptomyces sp. MBT27. J. Ind. Microbiol. Biotechnol. 46, 483–492 (2019).

  43. Yao, L. et al. Discovery of novel xylosides in co-culture of basidiomycetes Trametes versicolor and Ganoderma applanatum by integrated metabolomics and bioinformatics. Sci. Rep. 6, 33237 (2016).

  44. Tripathi, A. et al. Intermittent hypoxia and hypercapnia, a hallmark of obstructive sleep apnea, alters the gut microbiome and metabolome. mSystems 3, e00020-18 (2018).

  45. Smits, S. A. et al. Seasonal cycling in the gut microbiome of the Hadza hunter-gatherers of Tanzania. Science 357, 802–806 (2017).

  46. McDonald, D. et al. American Gut: an open platform for citizen science microbiome research. mSystems 3, e00031-18 (2018).

  47. Garg, N. et al. Three-dimensional microbiome and metabolome cartography of a diseased human lung. Cell Host Microbe 22, 705–716.e4 (2017).

  48. Edlund, A. et al. Metabolic fingerprints from the human oral microbiome reveal a vast knowledge gap of secreted small peptidic molecules. mSystems 2, e00058-17 (2017).

  49. McCall, L. I. et al. Mass spectrometry-based chemical cartography of a cardiac parasitic infection. Anal. Chem. 89, 10414–10421 (2017).

  50. Watrous, J. D. et al. Directed non-targeted mass spectrometry and chemical networking for discovery of eicosanoids and related oxylipins. Cell Chem. Biol. 26, 433–442.e4 (2019).

  51. Allard, S., Allard, P. M., Morel, I. & Gicquel, T. Application of a molecular networking approach for clinical and forensic toxicology exemplified in three cases involving 3-MeO-PCP, doxylamine, and chlormequat. Drug Test. Anal. 11, 669–677 (2018).

  52. Ernst, M. et al. Assessing specialized metabolite diversity in the cosmopolitan plant genus Euphorbia L. Front. Plant Sci. 10, 846 (2019).

  53. Philippus, A. C. et al. Molecular networking prospection and characterization of terpenoids and C15-acetogenins in Brazilian seaweed extracts. RSC Adv. 8, 29654–29661 (2018).

  54. Li, F., Janussen, D., Peifer, C., Perez-Victoria, I. & Tasdemir, D. Targeted isolation of tsitsikammamines from the Antarctic deep-sea sponge Latrunculia biformis by molecular networking and anticancer activity. Mar. Drugs 16, 268 (2018).

  55. Hartmann, A. C. et al. Meta-mass shift chemical profiling of metabolomes from coral reefs. Proc. Natl Acad. Sci. USA 114, 11685–11690 (2017).

  56. Tobias, N. J. et al. Natural product diversity associated with the nematode symbionts Photorhabdus and Xenorhabdus. Nat. Microbiol. 2, 1676–1685 (2017).

  57. Nothias, L. F. et al. Bioactivity-based molecular networking for the discovery of drug leads in natural product bioassay-guided fractionation. J. Nat. Prod. 81, 758–767 (2018).

  58. Zou, Y. et al. Computationally assisted discovery and assignment of a highly strained and PANC-1 selective alkaloid from Alaska’s deep ocean. J. Am. Chem. Soc. 141, 4338–4344 (2019).

  59. Parkinson, E. I. et al. Discovery of the tyrobetaine natural products and their biosynthetic gene cluster via metabologenomics. ACS Chem. Biol. 13, 1029–1037 (2018).

  60. Naman, C. B. et al. Integrating molecular networking and biological assays to target the isolation of a cytotoxic cyclic octapeptide, samoamide A, from an American Samoan marine cyanobacterium. J. Nat. Prod. 80, 625–633 (2017).

  61. Bouslimani, A. et al. Lifestyle chemistries from phones for individual profiling. Proc. Natl Acad. Sci. USA 113, E7645–E7654 (2016).

  62. Fox Ramos, A. E. et al. CANPA: computer-assisted natural products anticipation. Anal. Chem. 91, 11247–11252 (2019).

  63. Quinn, R. A. et al. Niche partitioning of a pathogenic microbiome driven by chemical gradients. Sci. Adv. 4, eaau1908 (2018).

  64. Aksenov, A. A., da Silva, R., Knight, R., Lopes, N. P. & Dorrestein, P. C. Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 1, 0054 (2017).

  65. Tsugawa, H. Advances in computational metabolomics and databases deepen the understanding of metabolisms. Curr. Opin. Biotechnol. 54, 10–17 (2018).

  66. Johnson, S. R. & Lange, B. M. Open-access metabolomics databases for natural product research: present capabilities and future potential. Front. Bioeng. Biotechnol. 3, 22 (2015).

  67. Haug, K. et al. MetaboLights—an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucleic Acids Res. 41, D781–D786 (2013).

  68. Perez-Riverol, Y. et al. Discovering and linking public omics data sets using the Omics Discovery Index. Nat. Biotechnol. 35, 406–409 (2017).

  69. Stein, S. E. & Scott, D. R. Optimization and testing of mass spectral library search algorithms for compound identification. J. Am. Soc. Mass Spectrom. 5, 859–866 (1994).

  70. Mohimani, H. & Pevzner, P. A. Dereplication, sequencing and identification of peptidic natural products: from genome mining to peptidogenomics to spectral networks. Nat. Prod. Rep. 33, 73–86 (2016).

  71. Yang, J. Y. et al. Molecular networking as a dereplication strategy. J. Nat. Prod. 76, 1686–1699 (2013).

  72. Moorthy, A. S., Wallace, W. E., Kearsley, A. J., Tchekhovskoi, D. V. & Stein, S. E. Combining fragment-ion and neutral-loss matching during mass spectral library searching: a new general purpose algorithm applicable to illicit drug identification. Anal. Chem. 89, 13261–13268 (2017).

  73. Klinman, J. P. The multi-functional topa-quinone copper amine oxidases. Biochim. Biophys. Acta 1637, 131–137 (2003).

  74. Guijas, C. et al. METLIN: a technology platform for identifying knowns and unknowns. Anal. Chem. 90, 3156–3164 (2018).

  75. Reddi, A. R. & Culotta, V. C. SOD1 integrates signals from oxygen and glucose to repress respiration. Cell 152, 224–235 (2013).

  76. Sheldon, M. T., Mistrik, R. & Croley, T. R. Determination of ion structures in structurally related compounds using precursor ion fingerprinting. J. Am. Soc. Mass Spectrom. 20, 370–376 (2009).

  77. Sawada, Y. et al. RIKEN tandem mass spectral database (ReSpect) for phytochemicals: a plant-specific MS/MS-based data resource and database. Phytochemistry 82, 38–45 (2012).

  78. Smith, C. A., Want, E. J., O’Maille, G., Abagyan, R. & Siuzdak, G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal. Chem. 78, 779–787 (2006).

  79. Tautenhahn, R., Patti, G. J., Rinehart, D. & Siuzdak, G. XCMS Online: a web-based platform to process untargeted metabolomic data. Anal. Chem. 84, 5035–5039 (2012).

  80. Wanichthanarak, K., Fan, S., Grapov, D., Barupal, D. K. & Fiehn, O. Metabox: a toolbox for metabolomic data analysis, interpretation and integrative exploration. PLoS One 12, e0171046 (2017).

  81. Mohimani, H. et al. Dereplication of microbial metabolites through database search of mass spectra. Nat. Commun. 9, 4035 (2018).

  82. Mohimani, H. et al. Dereplication of peptidic natural products through database search of mass spectra. Nat. Chem. Biol. 13, 30–37 (2017).

  83. Gurevich, A. et al. Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra. Nat. Microbiol. 3, 319–327 (2018).

  84. da Silva, R. R. et al. Propagating annotations of molecular networks using in silico fragmentation. PLoS Comput. Biol. 14, e1006089 (2018).

  85. Mohimani, H. et al. Automated genome mining of ribosomal peptide natural products. ACS Chem. Biol. 9, 1545–1551 (2014).

  86. Olivon, F. et al. MetGem software for the generation of molecular networks based on the t-SNE algorithm. Anal. Chem. 90, 13900–13908 (2018).

  87. Olivon, F., Roussi, F., Litaudon, M. & Touboul, D. Optimized experimental workflow for tandem mass spectrometry molecular networking in metabolomics. Anal. Bioanal. Chem. 409, 5767–5778 (2017).

  88. Wehrens, R. et al. Improved batch correction in untargeted MS-based metabolomics. Metabolomics 12, 88 (2016).

  89. Koal, T. & Deigner, H. P. Challenges in mass spectrometry based targeted metabolomics. Curr. Mol. Med. 10, 216–226 (2010).

  90. Bylda, C., Thiele, R., Kobold, U. & Volmer, D. A. Recent advances in sample preparation techniques to overcome difficulties encountered during quantitative analysis of small molecules from biofluids using LC-MS/MS. Analyst 139, 2265–2276 (2014).

  91. Vuckovic, D. Current trends and challenges in sample preparation for global metabolomics using liquid chromatography-mass spectrometry. Anal. Bioanal. Chem. 403, 1523–1548 (2012).

  92. Dunn, W. B. et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat. Protoc. 6, 1060 (2011).

  93. Taylor, P. J. Matrix effects: the Achilles heel of quantitative high-performance liquid chromatography-electrospray-tandem mass spectrometry. Clin. Biochem. 38, 328–334 (2005).

  94. Annesley, T. M. Ion suppression in mass spectrometry. Clin. Chem. 49, 1041–1044 (2003).

  95. Crüsemann, M. et al. Prioritizing natural product diversity in a collection of 146 bacterial strains based on growth and extraction protocols. J. Nat. Prod. 80, 588–597 (2017).

  96. Wandro, S., Carmody, L., Gallagher, T., LiPuma, J. J. & Whiteson, K. Making it last: storage time and temperature have differential impacts on metabolite profiles of airway samples from cystic fibrosis patients. mSystems 2, e00100-17 (2017).

  97. Zhao, J., Evans, C. R., Carmody, L. A. & LiPuma, J. J. Impact of storage conditions on metabolite profiles of sputum samples from persons with cystic fibrosis. J. Cyst. Fibros. 14, 468–473 (2015).

  98. Hirayama, A. et al. Effects of processing and storage conditions on charged metabolomic profiles in blood. Electrophoresis 36, 2148–2155 (2015).

  99. Mushtaq, M. Y., Choi, Y. H., Verpoorte, R. & Wilson, E. G. Extraction for metabolomics: access to the metabolome. Phytochem. Anal. 25, 291–306 (2014).

  100. Scheubert, K. et al. Significance estimation for large scale metabolomics annotations by spectral matching. Nat. Commun. 8, 1494 (2017).

  101. Sleno, L. & Volmer, D. A. Ion activation methods for tandem mass spectrometry. J. Mass Spectrom. 39, 1091–1112 (2004).

  102. Tang, Z. & Guengerich, F. P. Dansylation of unactivated alcohols for improved mass spectral sensitivity and application to analysis of cytochrome P450 oxidation products in tissue extracts. Anal. Chem. 82, 7706–7712 (2010).

  103. Bazsó, F. L. et al. Quantitative comparison of tandem mass spectra obtained on various instruments. J. Am. Soc. Mass Spectrom. 27, 1357–1365 (2016).

  104. Bowen, B. P. & Northen, T. R. Dealing with the unknown: metabolomics and metabolite atlases. J. Am. Soc. Mass Spectrom. 21, 1471–1476 (2010).

  105. da Silva, R. R., Dorrestein, P. C. & Quinn, R. A. Illuminating the dark matter in metabolomics. Proc. Natl Acad. Sci. USA 112, 12549–12550 (2015).

  106. Blaženović, I., Kind, T., Ji, J. & Fiehn, O. Software tools and approaches for compound identification of LC-MS/MS data in metabolomics. Metabolites 8, 31 (2018).

  107. Ruttkies, C., Schymanski, E. L., Wolf, S., Hollender, J. & Neumann, S. MetFrag relaunched: incorporating strategies beyond in silico fragmentation. J. Cheminf. 8, 3 (2016).

  108. Gerlich, M. & Neumann, S. MetFusion: integration of compound identification strategies. J. Mass Spectrom. 48, 291–298 (2013).

  109. Böcker, S., Letzel, M. C., Liptak, Z. & Pervukhin, A. SIRIUS: decomposing isotope patterns for metabolite identification. Bioinformatics 25, 218–224 (2009).

  110. Dührkop, K. et al. SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nat. Methods 16, 299–302 (2019).

  111. Dührkop, K., Shen, H., Meusel, M., Rousu, J. & Böcker, S. Searching molecular structure databases with tandem mass spectra using CSI:FingerID. Proc. Natl Acad. Sci. USA 112, 12580–12585 (2015).

  112. Tsugawa, H. et al. Hydrogen rearrangement rules: computational MS/MS fragmentation and structure elucidation using MS-FINDER software. Anal. Chem. 88, 7946–7958 (2016).

  113. Protsyuk, I. et al. 3D molecular cartography using LC-MS facilitated by Optimus and ‘ili software. Nat. Protoc. 13, 134–154 (2018).

  114. Röst, H. L. et al. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat. Methods 13, 741–748 (2016).

  115. Pluskal, T., Castillo, S., Villar-Briones, A. & Oresic, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinforma. 11, 395 (2010).

  116. Deutsch, E. W. et al. Proteomics Standards Initiative: fifteen years of progress and future work. J. Proteome Res. 16, 4288–4298 (2017).

  117. Brooksbank, C., Cameron, G. & Thornton, J. The European Bioinformatics Institute’s data resources. Nucleic Acids Res. 38, D17–D25 (2010).

  118. McLafferty, F. W. & Tureček, F. Interpretation of Mass Spectra 4th edn (University Science Books, 1993).

  119. Cleary, J. L., Luu, G. T., Pierce, E. C., Dutton, R. J. & Sanchez, L. M. BLANKA: an algorithm for blank subtraction in mass spectrometry of complex biological samples. J. Am. Soc. Mass Spectrom. 30, 1426–1434 (2019).

  120. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 3, 211–221 (2007).

  121. Viant, M. R., Kurland, I. J., Jones, M. R. & Dunn, W. B. How close are we to complete annotation of metabolomes? Curr. Opin. Chem. Biol. 36, 64–69 (2017).

  122. Wang, J., Peake, D. A., Mistrik, R., Huang, Y. & Araujo, G. D. A Platform to Identify Endogenous Metabolites Using a Novel High Performance Orbitrap MS and the mzCloud Library. http://www.unitylabservices.eu/content/dam/tfs/ATG/CMD/CMD%20Documents/posters/PN-ASMS13-a-platform-to-identify-endogenous-metabolites-using-a-novel-high-performance-orbitrap-and-the-mzcloud-library-E.pdf (Thermo Scientific, 2013).

  123. Shahaf, N. et al. The WEIZMASS spectral library for high-confidence metabolite identification. Nat. Commun. 7, 12423 (2016).

  124. Schymanski, E. L. et al. Identifying small molecules via high resolution mass spectrometry: communicating confidence. Environ. Sci. Technol. 48, 2097–2098 (2014).

  125. Demarque, D. P., Crotti, A. E. M., Vessecchi, R., Lopes, J. L. C. & Lopes, N. P. Fragmentation reactions using electrospray ionization mass spectrometry: an important tool for the structural elucidation and characterization of synthetic and natural products. Nat. Prod. Rep. 33, 432–455 (2016).

  126. van der Hooft, J. J. J., Wandy, J., Barrett, M. P., Burgess, K. E. V. & Rogers, S. Topic modeling for untargeted substructure exploration in metabolomics. Proc. Natl Acad. Sci. USA 113, 13738–13743 (2016).

  127. Marfey, P. Determination of D-amino acids. II. Use of a bifunctional reagent, 1,5-difluoro-2,4-dinitrobenzene. Carlsberg Res. Commun. 49, 591–596 (1984).

  128. Su, G., Morris, J. H., Demchak, B. & Bader, G. D. Biological network exploration with Cytoscape 3. Curr. Protoc. Bioinforma. 47, 8.13.1–8.13.24 (2014).

  129. Smoot, M. E., Ono, K., Ruscheinski, J., Wang, P. L. & Ideker, T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27, 431–432 (2011).

  130. Sandhu, C. et al. Evaluation of data-dependent versus targeted shotgun proteomic approaches for monitoring transcription factor expression in breast cancer. J. Proteome Res. 7, 1529–1541 (2008).

  131. Hubert, J., Nuzillard, J.-M. & Renault, J.-H. Dereplication strategies in natural product research: how many tools and methodologies behind the same concept? Phytochem. Rev. 16, 55–95 (2017).

  132. Rochat, B. Proposed confidence scale and ID score in the identification of known-unknown compounds using high resolution MS data. J. Am. Soc. Mass Spectrom. 28, 709–723 (2017).

  133. All natural. Nat. Chem. Biol. 3, 351 (2007).

  134. IUPAC (International Union of Pure and Applied Chemistry). Compendium of Chemical Terminology—The “Gold Book” (eds McNaught, A. D. & Wilkinson, A.) (Blackwell Scientific Publications, 1997).

  135. McLafferty, F. W. Tandem mass spectrometry. Science 214, 280–287 (1981).

  136. Gross, J. H. Mass Spectrometry: A Textbook 415–478 (Springer, 2011).

  137. Vazquez-Baeza, Y., Pirrung, M., Gonzalez, A. & Knight, R. EMPeror: a tool for visualizing high-throughput microbial community data. GigaScience 2, 16 (2013).

  138. McDonald, D. et al. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. GigaScience 1, 7 (2012).

  139. Bolyen, E. et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 37, 852–857 (2019).

  140. Wang, M. et al. Mass spectrometry searches using MASST. Nat. Biotechnol. 38, 23–26 (2020).

  141. Jarmusch, A. K. et al. Repository-scale co- and re-analysis of tandem mass spectrometry data. Preprint at https://www.biorxiv.org/content/10.1101/750471v1 (2019).

  142. Olivon, F., Grelier, G., Roussi, F., Litaudon, M. & Touboul, D. MZmine 2 data-preprocessing to enhance molecular networking reliability. Anal. Chem. 89, 7836–7840 (2017).

  143. Winnikoff, J. R., Glukhov, E., Watrous, J., Dorrestein, P. C. & Gerwick, W. H. Quantitative molecular networking to profile marine cyanobacterial metabolomes. J. Antibiot. (Tokyo) 67, 105–112 (2014).

  144. Tsugawa, H. et al. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat. Methods 12, 523–526 (2015).

  145. Jones, A. R. et al. The mzIdentML data standard for mass spectrometry-based proteomics results. Mol. Cell. Proteom. 11, M111.014381 (2012).

  146. Griss, J. et al. The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience. Mol. Cell. Proteom. 13, 2765–2775 (2014).

  147. Hoffmann, N. et al. mzTab-M: a data standard for sharing quantitative results in mass spectrometry metabolomics. Anal. Chem. 91, 3302–3310 (2019).

Acknowledgements

We acknowledge funding from the following: the National Research System (SNI) of SENACYT Panama (C.A.B.P., M.H.C., J.L.-B., and M.G.); the Gordon and Betty Moore Foundation (P.C.D., N.B., and K.L.M.); the National Institutes of Health (GM122016-01; K.L.M.); and the National Science Foundation (DEB1354944, R.M.T.; IOS-1656481, A.M.C.-R. and P.C.D.). A.K.J. acknowledges an American Society for Mass Spectrometry 2018 Postdoctoral Career Development Award. D.P. was supported through the Deutsche Forschungsgemeinschaft (DFG; PE 2600/1). F.T. and N.N. acknowledge Shimadzu South Africa (Pty) Ltd for support and training. We are grateful for grant R03 CA211211 (P.C.D.) on reuse of metabolomics data and grant P41 GM103484 (P.C.D., N.B.) to the Center for Computational Mass Spectrometry, as well as instrument support through NIH S10RR029121 (P.C.D.). A.I.C. and Y.Z. were supported through an Auburn University Presidential Award for Interdisciplinary Research (PAIR).

Author information

Contributions

Design and oversight of the project: P.C.D., M.W., N.B. Instrument acquisition parameters: A.T.A., E.C.G., R.A.K., K.L.M., R.M.T., K.B.K., S.B., C.R., A.W.T., F.T., N.N., A.K.J., A.M.U. Data conversion and upload: K.L.M., E.C.G., A.T.A., J.J.J.v.d.H., M.E. GNPS documentation: M.W., L.F.N., E.C.G., A.T.A., K.L.M., J.J.J.v.d.H., M.E., M.N.-E. Cytoscape documentation: M.N.-E., F.V., K.C.W., I.K., A.M.C.-R. Metadata curation: J.M.G., C.M.A., F.V., A.M.C.-R. Mass spectra annotations: D.P., R.S., M.E. Theoretical tools and advanced features, statistical analysis: L.F.N., A.A.A. Supplementary information: A.T.A., N.S., E.C.G., K.L.M., M.E. Testing the workflows described and improving the descriptions: Y.Z., A.I.C., A.B., K.S., N.T., A.M.U., J.A.T.M., M.H.C., C.A.B.P., M.G., V.V.-C., J.L.-B., R.M.-F., M.E.

Corresponding authors

Correspondence to Nuno Bandeira, Mingxun Wang or Pieter C. Dorrestein.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Protocols thanks Vinayak Agarwal, Mehdi Beniddir, Alfonso Mangoni and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key reference(s) using this protocol

Vermeeren, P., Sun, X. & Bickelhaupt, F.M. Sci. Rep. 8, 10729 (2018): https://doi.org/10.1038/s41598-018-28998-3

Sun, X., Soini, T. M., Poater, J., Hamlin, T. A. & Bickelhaupt, F. M. J. Comput. Chem. 40, 2227–2233 (2019): https://doi.org/10.1002/jcc.25871

Key data used in this protocol

Quinn, R. A. et al. Sci. Adv. 4, eaau1908 (2018): https://doi.org/10.1126/sciadv.aau1908

Wang, M. et al. Nat. Biotechnol. 34, 828–837 (2016): https://doi.org/10.1038/nbt.3597

Garg, N. et al. Cell Host Microbe 22, 705–716.e4 (2017): https://doi.org/10.1016/j.chom.2017.10.001

Preprint version of this protocol

Aron, A. T. et al. Preprint at https://doi.org/10.26434/chemrxiv.9333212.v1 (2019)

Supplementary information

Supplementary Information

Supplementary Figs. 1–8, Supplementary Methods and Supplementary Tables 1–4.

Reporting Summary

About this article

Cite this article

Aron, A.T., Gentry, E.C., McPhail, K.L. et al. Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nat Protoc 15, 1954–1991 (2020). https://doi.org/10.1038/s41596-020-0317-5

  • DOI: https://doi.org/10.1038/s41596-020-0317-5
