Abstract
Purpose
Reference databases play a pivotal role in amplicon microbiome research; however, these databases differ in their sequence content and taxonomic information. Studies on mock communities and human health microbiomes have revealed that the choice of reference database affects the outcome. Nonetheless, the influence of reference databases on environmental microbiome studies has not been explicitly illustrated.
Methods
This study analyzed amplicon data (V1V3, V3V4, V4V5 and V6V8 regions) of 128 soil samples and evaluated the impact of four 16S rRNA reference databases, the Genome Taxonomy Database (GTDB), the Ribosomal Database Project (RDP), SILVA and Consensus Taxonomy (ConTax), on microbiome inference.
Results
The analyses showed that, for each amplicon region, the distribution of observed amplicon sequence variants differed significantly (P-value < 2.647e−12) across the four datasets generated using the different databases. Beta diversity was also altered by the choice of database. Further investigation revealed that the microbiome composition inferred by the various databases differed significantly (P-value = 0.001), irrespective of amplicon region. This study found that the core-microbiome structure in environmental studies is influenced by the type of reference database used.
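To make the composition comparison concrete, the following is a minimal sketch of a permutation test on Bray-Curtis dissimilarities (a PERMANOVA-style pseudo-F). The abstract does not name the test that was used, so the choice of PERMANOVA, the function names (`bray_curtis`, `pseudo_f`, `permanova`), the 999 permutations and the toy abundance profiles are all assumptions made for illustration and are not taken from the study.

```python
import random
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared pairwise distances divided by n.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    k = len(groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    n = len(samples)
    dist = [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between profile and database
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical genus-level counts for the same four samples, as they might
# look after classification against two different reference databases.
db_a = [[40, 30, 20, 10], [42, 28, 19, 11], [38, 31, 21, 10], [41, 29, 20, 10]]
db_b = [[10, 20, 30, 40], [11, 19, 28, 42], [10, 21, 31, 38], [10, 20, 29, 41]]
labels = ["A"] * 4 + ["B"] * 4

f_stat, p_value = permanova(db_a + db_b, labels)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

With this p-value formula, 999 permutations cannot yield a value below 1/1000 = 0.001. A reported P-value of 0.001 would be consistent with that floor, although the abstract does not state how many permutations were run.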
Conclusion
In summary, the present study illustrates that the choice of reference database could influence the outcome of environmental microbiome research.
Data availability
The data analyzed in this study were downloaded from the Sequence Read Archive (SRA).
Code availability
Not applicable.
References
Almeida A, Mitchell AL, Tarkowska A, Finn RD (2018) Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments. GigaScience. https://doi.org/10.1093/gigascience/giy054
Arita M, Karsch-Mizrachi I, Cochrane G (2021) The international nucleotide sequence database collaboration. Nucleic Acids Res 49:D121–D124. https://doi.org/10.1093/nar/gkaa967
Balvočiūtė M, Huson DH (2017) SILVA, RDP, greengenes, NCBI and OTT—how do these taxonomies compare? BMC Genomics 18:114. https://doi.org/10.1186/s12864-017-3501-4
Bižić M, Klintzsch T, Ionescu D et al (2020) Aquatic and terrestrial cyanobacteria produce methane. Sci Adv 6:eaax5343. https://doi.org/10.1126/sciadv.aax5343
Boone DR, Castenholz RW, Garrity GM (eds) (2001) Bergey’s manual of systematic bacteriology, 2nd edn. Springer, New York
Callahan BJ, McMurdie PJ, Rosen MJ et al (2016) DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13:581–583. https://doi.org/10.1038/nmeth.3869
Caruso V, Song X, Asquith M, Karstens L (2019) Performance of microbiome sequence inference methods in environments with varying biomass. mSystems 4:8. https://doi.org/10.1128/mSystems.00163-18
Cole JR, Wang Q, Fish JA et al (2014) Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res 42:D633–D642. https://doi.org/10.1093/nar/gkt1244
Delgado-Baquerizo M, Oliverio AM, Brewer TE et al (2018) A global atlas of the dominant bacteria found in soil. Science 359:320–325. https://doi.org/10.1126/science.aap9516
Dick GJ, Baker BJ (2013) Omic approaches in microbial ecology: charting the unknown: analysis of whole-community sequence data is unveiling the diversity and function of specific microbial groups within uncultured phyla and across entire microbial ecosystems. Microbe Mag 8:353–360. https://doi.org/10.1128/microbe.8.353.1
Edgar R (2018) Taxonomy annotation and guide tree errors in 16S rRNA databases. PeerJ 6:e5030. https://doi.org/10.7717/peerj.5030
Gilbert JA, Jansson JK, Knight R (2014) The Earth Microbiome project: successes and aspirations. BMC Biol 12:69. https://doi.org/10.1186/s12915-014-0069-1
Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ (2017) Microbiome datasets are compositional: and this is not optional. Front Microbiol 8:2224. https://doi.org/10.3389/fmicb.2017.02224
Handelsman J (2004) Metagenomics: application of genomics to uncultured microorganisms. Microbiol Mol Biol Rev 68:669–685. https://doi.org/10.1128/MMBR.68.4.669-685.2004
Janssen PH, Yates PS, Grinton BE et al (2002) Improved culturability of soil bacteria and isolation in pure culture of novel members of the divisions Acidobacteria, Actinobacteria, Proteobacteria, and Verrucomicrobia. Appl Environ Microbiol 68:2391–2396. https://doi.org/10.1128/AEM.68.5.2391-2396.2002
Johnson JS, Spakowicz DJ, Hong B-Y et al (2019) Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis. Nat Commun 10:5029. https://doi.org/10.1038/s41467-019-13036-1
Liland KH, Vinje H, Snipen L (2017) microclass: an R-package for 16S taxonomy classification. BMC Bioinform 18:172. https://doi.org/10.1186/s12859-017-1583-2
Lundin D, Severin I, Logue JB et al (2012) Which sequencing depth is sufficient to describe patterns in bacterial α- and β-diversity?: Sequencing depth in diversity research. Environ Microbiol Rep 4:367–372. https://doi.org/10.1111/j.1758-2229.2012.00345.x
Lydon KA, Lipp EK (2018) Taxonomic annotation errors incorrectly assign the family Pseudoalteromonadaceae to the order Vibrionales in Greengenes: implications for microbial community assessments. PeerJ 6:e5248. https://doi.org/10.7717/peerj.5248
McMurdie PJ, Holmes S (2013) phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8:e61217. https://doi.org/10.1371/journal.pone.0061217
Murali A, Bhargava A, Wright ES (2018) IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences. Microbiome 6:140. https://doi.org/10.1186/s40168-018-0521-5
Oliverio AM, Geisen S, Delgado-Baquerizo M et al (2020) The global-scale distributions of soil protists and their contributions to belowground systems. Sci Adv 6:eaax8787. https://doi.org/10.1126/sciadv.aax8787
Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290. https://doi.org/10.1093/bioinformatics/btg412
Park S-C, Won S (2018) Evaluation of 16S rRNA databases for taxonomic assignments using a mock community. Genomics Inform 16:e24. https://doi.org/10.5808/GI.2018.16.4.e24
Parks DH, Chuvochina M, Waite DW et al (2018) A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36:996–1004. https://doi.org/10.1038/nbt.4229
Parks DH, Chuvochina M, Chaumeil P-A et al (2020) A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol 38:1079–1086. https://doi.org/10.1038/s41587-020-0501-8
Parte AC (2014) LPSN—list of prokaryotic names with standing in nomenclature. Nucl Acids Res 42:D613–D616. https://doi.org/10.1093/nar/gkt1111
Pham VHT, Kim J (2012) Cultivation of unculturable soil bacteria. Trends Biotechnol 30:475–484. https://doi.org/10.1016/j.tibtech.2012.05.007
Quast C, Pruesse E, Yilmaz P et al (2012) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41:D590–D596. https://doi.org/10.1093/nar/gks1219
Ramakodi MP (2021a) Effect of amplicon sequencing depth in environmental microbiome research. Curr Microbiol 78:1026–1033. https://doi.org/10.1007/s00284-021-02345-8
Ramakodi MP (2021b) A comprehensive evaluation of single-end sequencing data analyses for environmental microbiome research. Arch Microbiol 203:6295–6302. https://doi.org/10.1007/s00203-021-02597-9
Robeson MS II, O’Rourke DR, Kaehler BD et al (2021) RESCRIPt: Reproducible sequence taxonomy reference database management for the masses. PLoS Comput Biol 17:e1009581. https://doi.org/10.1371/journal.pcbi.1009581
Sierra MA, Li Q, Pushalkar S et al (2020) The influences of bioinformatics tools and reference databases in analyzing the human oral microbial community. Genes 11:878. https://doi.org/10.3390/genes11080878
Soriano-Lerma A, Pérez-Carrasco V, Sánchez-Marañón M et al (2020) Influence of 16S rRNA target region on the outcome of microbiome studies in soil and saliva samples. Sci Rep 10:13637. https://doi.org/10.1038/s41598-020-70141-8
Steen AD, Crits-Christoph A, Carini P et al (2019) High proportions of bacteria and archaea across most biomes remain uncultured. ISME J 13:3126–3130. https://doi.org/10.1038/s41396-019-0484-y
Thompson LR, Sanders JG, McDonald D et al (2017) A communal catalogue reveals Earth’s multiscale microbial diversity. Nature 551:457–463. https://doi.org/10.1038/nature24621
Wickham H (2016) ggplot2: elegant graphics for data analysis, 2nd edn. Springer, Cham
Wickham H, Averick M, Bryan J et al (2019) Welcome to the Tidyverse. JOSS 4:1686. https://doi.org/10.21105/joss.01686
Wright ES (2016) Using DECIPHER v2.0 to analyze big biological sequence data in R. R J 8:352. https://doi.org/10.32614/RJ-2016-025
Yang B, Wang Y, Qian P-Y (2016) Sensitivity and correlation of hypervariable regions in 16S rRNA genes in phylogenetic analysis. BMC Bioinform 17:135. https://doi.org/10.1186/s12859-016-0992-y
Acknowledgements
I would like to thank Dr. Bhawna Dubey, Chief Scientific Officer, Reprocell Bioserve Biotechnologies Pvt. Ltd., Hyderabad for reviewing the manuscript. The effort of Soriano-Lerma et al. (2020) for making the data publicly available on SRA is highly appreciated. CSIR-NEERI is acknowledged for providing the necessary support to carry out the analyses. The manuscript draft is submitted in the Institute Repository under the KRC No.: CSIR-NEERI/KRC/2021/JUNE/HZC/1.
Funding
None.
Author information
Authors and Affiliations
Contributions
Single author.
Corresponding author
Ethics declarations
Conflict of interest
None.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
About this article
Cite this article
Ramakodi, M.P. Influence of 16S rRNA reference databases in amplicon-based environmental microbiome research. Biotechnol Lett 44, 523–533 (2022). https://doi.org/10.1007/s10529-022-03233-2
DOI: https://doi.org/10.1007/s10529-022-03233-2