Main

Changes in the human microbiome have been associated with many disease and health states1. However, reporting the results of human microbiome research is challenging, as it often involves approaches from microbiology, genomics, biomedicine, bioinformatics, statistics, epidemiology and other fields, which results in a lack of consistent recommendations for the reporting of methods and results. Inconsistent reporting can have consequences for the field by affecting the reproducibility of study results2. Although researchers have called for better reporting standards3, such as the Genomic Standards Consortium’s MIxS checklist4, to provide a means for reporting sampling, processing and data generation, no comprehensive standardized guidelines spanning laboratory and epidemiological reporting of microbiome studies have been proposed.

Standard reporting guidelines promote research consistency and, as a consequence, encourage reproducibility and improved study design. Editorial adoption of the CONSORT (Consolidated Standards of Reporting Trials) guidelines, for example, has been associated with an increase in the quality of trial reporting5,6. Other epidemiological reporting guidelines have seen broad adoption, such as STROBE (Strengthening the Reporting of Observational studies in Epidemiology)7 and STREGA (Strengthening the Reporting of Genetic Association Studies)8. STROBE-metagenomics9 proposes an extension to the STROBE checklist for metagenomics studies. Subsequent to the MIAME (Minimum Information About a Microarray Experiment) checklist9, the MIMARKS (Minimum Information about a MARKer gene Sequence) and MIxS (Minimum Information about any (x) Sequence) checklists provide detailed guidance on the reporting of sequencing studies in general. These are focused on the technical aspects of data generation, however, as are projects such as the MBQC (Microbiome Quality Control) project10 and IHMS (International Human Microbiome Standards)11,12. Together, these serve as useful foundations, but they do not span the full range of reporting of human microbiome studies or include items intended for other types of studies, and they provide limited guidance on manuscript preparation.

Studies of the human microbiome share many features with studies of other types of molecular epidemiology, but they also require unique considerations with their own methodological best practices and reporting standards. In addition to standard elements of epidemiological study design, culture-independent microbiome studies involve the collection, handling and preservation of biological specimens; evolving approaches to laboratory processing with elevated potential for batch effects; bioinformatics processing; statistical analysis of sparse, unusually distributed, high-dimensional data; and reporting of results on potentially thousands of microbial features13,14,15. Because there is no agreed-upon gold-standard method for microbiome research and the field has not reached consensus on many of these aspects, inconsistencies in reporting inhibit reproducibility and hamper efforts to draw conclusions across similar studies.

For these reasons, we convened a multi-disciplinary working group to develop guidelines tailored to microbiome study reporting. Members of this group include epidemiologists, biostatisticians, bioinformaticians, physician-scientists, genomicists and microbiologists. The checklist is designed to balance completeness with burden of use and is applicable to a broad range of human microbiome study designs and analyses. The STORMS (Strengthening The Organizing and Reporting of Microbiome Studies) checklist (Supplementary Table 1) draws relevant items from related guidelines and adds new tailored guidelines to serve as a tool to organize study planning and manuscript preparation, to improve the clarity of manuscripts and to help reviewers and readers assess these studies.

Methodology

STORMS was the result of a collaborative development process. In this section we discuss the methodology used to prepare the STORMS guidelines.

Origin and development

The origins of these guidelines are rooted in a project to create a standardized database of published literature reporting relationships between the microbiome and disease (https://bugsigdb.org/; website in preparation). The goal of that project is to create a publicly available, standardized database of microbiome study findings indexed by condition of interest (for example, disease, health status, diet or environmental factor), microbiome site (for example, gut, mouth or skin) and microbial taxonomy to aid comparative analysis. As of August 2021, 31 curators (Supplementary Table 2) had extracted findings from 513 unique published studies (Supplementary Table 3). Included studies must have examined the relationship between the microbiome and a condition of interest and must have included findings on a taxonomic level (even if all findings were null).

This review revealed substantial reporting heterogeneity, particularly for epidemiology, such as study design, confounding factors and sources of bias. It also revealed microbiome-specific issues, including statistical analysis of compositional relative abundance data and handling of ‘batch’ effects16. This heterogeneity highlighted the need for standardized reporting guidelines, similar to those used in other fields of study. The curators determined that standardized reporting guidelines would streamline the review process but would also, more importantly, help researchers throughout the field of microbiome research communicate their findings effectively.

The resulting multidisciplinary group of bioinformaticians, epidemiologists, biostatisticians and microbiologists was thus convened to discuss microbiome reporting standards. The group began by reviewing existing reporting standards, including STROBE17, STROBE-ME18, STREGA8, MICRO17, MIMARKS4 and STROGAR19. The group also reviewed existing articles containing recommendations for microbiome reporting20,21. The STROBE and STREGA guidelines were used as a starting point for the STORMS checklist, although aspects were incorporated from the other reporting standards.

Following the guidelines on the development of reporting standards recommended by EQUATOR, the group created a comprehensive list of potential guideline items. From this list, group members added, modified and removed items on the basis of their expertise. After the first round of edits, the checklist was then applied to a recent microbiome study22 by group members. Comments, removals and additions were harmonized after each round. On the basis of this process, additional changes, simplifications and clarifications were made. This process was repeated until there was a group consensus that the checklist was ready for use.

In addition to the core working group, outside subject-matter experts identified by members of the working group were then invited to review the guidelines and provide feedback as members of the STORMS Consortium. Substantive feedback (that is, not grammar, spelling or other small changes) from 46 authors was organized by topic and compiled into a feedback document, which the working group answered in the style of a response-to-reviewers letter. After this round of revisions, consortium members were once again invited to review the checklist before submission for publication.

Elaboration and explanation of checklist items

This section describes the items in the STORMS checklist.

Checklist

The latest version of the checklist and a summary of items at the time of publication are presented in Supplementary Table 1; updated versions will be posted online (https://stormsmicrobiome.org). Of the items in the latest version of the STORMS checklist, nine items or sub-items were unchanged from STROBE, three were modified from STROBE, one was modified from STREGA, and fifty-seven new guidelines were developed. Nine items that overlap MIxS are specified. Rationales for new and modified items are presented below. Documentation of items unmodified from STROBE and STREGA was presented in the publications of those checklists.

Abstract (1.0–1.3)

Along with commonly included abstract materials, such as a basic description of the participants and results, authors should report the study design23 — such as a cross-sectional, case-control, cohort or randomized controlled trial — in the abstract of their article (item 1.1), as required by other reporting guidelines. Communication of the study design in the abstract allows readers to quickly categorize the type of evidence provided. As part of this basic description, sequencing methods should be mentioned (item 1.2). Body site(s) sampled should also be included (item 1.3).

Introduction (2.0–2.1)

The introduction should clearly describe the underlying background, evidence or theory that motivated the current study (item 2.0). Among other possibilities, this could include pilot study data, previous findings from a similar study or topic or a biologically plausible mechanism that has been proposed. This clarifies for the reader the motivations for the present study. If the study is exploratory in nature, the introduction should explain what motivated the current exploration and the goals of the exploratory study. The hypothesis developed on the basis of the background should be included. If the study was exploratory and did not define a hypothesis, pre-specified study objectives should be included (item 2.1).

Methods (3.0–8.5)

Methods constitute a majority of the checklist, as outlined below.

Participants (3.0–3.9)

The methods section should contain sufficient information for study replicability. Because study design is essential to understanding a study, this should be stated in the methods (item 3.0). In the description of the participants in the study, the population of interest should be described, as well as how participants were sampled from the source population (item 3.1). Because participant characteristics such as environment24, lifestyle behaviors, diet, biomedical interventions, demographics25 and geography26 (item 3.2) can correspond with substantial differences in the microbiome, it is essential to include this description. Temporal context can be important as well, so start and end dates for recruitment, follow-up and data collection should be stated (item 3.3).

Specific criteria used to assess potential participants for eligibility in the study should also be reported, with details of both inclusion criteria and exclusion criteria (item 3.4). Inclusion and exclusion criteria are pre-established characteristics used for the selection of participants into a study, and describing these criteria is essential for understanding a study’s target population27. This is expanded from STROBE, which requires eligibility criteria but does not specify that both inclusion criteria and exclusion criteria should be reported in detail. Any information collected about antibiotics or other treatments that could affect the microbiome should be described (item 3.5), as well as whether any exclusion criteria included recent use of antibiotics or other medications.

The final analytic sample sizes should be stated, as well as the reason for any exclusion of participants at any step of the recruitment, follow-up or laboratory processes (item 3.6). STROBE suggests using a flowchart to show when and why participants were removed from the study. A template for such flowcharts is presented here (Fig. 1), and a public-domain version is available for re-use online (https://stormsmicrobiome.org/figures). If participants were lost to follow-up or did not complete all assessments in a longitudinal study, details on how follow-ups were conducted should be stated and time-point-specific sample sizes should also be reported (item 3.7). Additionally, studies that matched cases to controls should describe what variables were used in matching (item 3.8).

Fig. 1: Examples of flowcharts for item 3.6.

Although they are not required by the STORMS guidelines, flowcharts can help in the visualization of how the final analytic sample was calculated.
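The bookkeeping behind such a flowchart (item 3.6) can be sketched in a few lines of Python. This is only an illustration: the function name `participant_flow`, the stage labels and all of the counts are hypothetical and do not come from any study or from the STORMS materials.

```python
def participant_flow(n_start, exclusions):
    """Return (stage, reason, n_removed, n_remaining) rows for a flowchart.

    exclusions is an ordered list of (stage, reason, n_removed) tuples.
    """
    rows = []
    remaining = n_start
    for stage, reason, n_removed in exclusions:
        # Guard against counts that cannot be reconciled with the sample.
        if n_removed < 0 or n_removed > remaining:
            raise ValueError(f"invalid exclusion count at: {reason}")
        remaining -= n_removed
        rows.append((stage, reason, n_removed, remaining))
    return rows


# Hypothetical numbers, for illustration only.
flow = participant_flow(
    250,
    [
        ("recruitment", "did not meet inclusion criteria", 30),
        ("recruitment", "recent antibiotic use", 12),
        ("follow-up", "lost to follow-up", 8),
        ("laboratory", "sample failed sequencing quality control", 5),
    ],
)

for stage, reason, removed, remaining in flow:
    print(f"{stage:12s} {reason:42s} -{removed:3d} -> {remaining}")

final_n = flow[-1][3]  # final analytic sample size
```

Keeping the reason and the stage with every count makes it straightforward to report both the final analytic sample size and where each participant was removed.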

Laboratory methods (4.0–4.17)

Since STROBE does not cover laboratory methods, new items were developed for the STORMS checklist. Laboratory methods should be described in sufficient detail to allow replication. The handling of lab samples should be described, including procedures for sample collection (item 4.1), shipping (item 4.2) and storage (item 4.3).

Because DNA extraction can be a major source of technical differences across studies10, DNA extraction methods should be described (item 4.4). Description of the removal of human DNA and enrichment for microbial DNA, if performed, should also be included (item 4.5). Likewise, if positive controls (item 4.7), negative controls (item 4.8) or contaminant-mitigation methods (item 4.9) were used, they should be identified and described.

Sequencing-related methods should be reported. This includes primer selection and DNA amplification (including the variable region of the 16S rRNA gene, if applicable) (item 4.6). Major divisions of sequencing strategy, such as shotgun or amplicon sequencing, should be identified (item 4.11). Finally, the methods used to determine relative abundances should be explained (item 4.12), and the read numbers that serve as denominators should be recorded.
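As a minimal sketch of why the denominator matters for item 4.12, the Python snippet below converts per-taxon read counts to proportions by simple total-sum scaling and returns the total read count alongside them. Total-sum scaling is only one of several possible approaches and is assumed here for illustration; the function name and the counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert per-taxon read counts for one sample to proportions.

    Returns (proportions, total_reads); total_reads is the denominator
    that should be recorded alongside the proportions.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has zero reads")
    return {taxon: n / total for taxon, n in counts.items()}, total


# Hypothetical read counts for a single sample.
sample = {"Bacteroides": 600, "Prevotella": 300, "Akkermansia": 100}
props, depth = relative_abundance(sample)

print(f"denominator (total reads): {depth}")
for taxon, p in props.items():
    print(f"{taxon}: {p:.3f}")
```

Reporting the denominator lets readers judge how much sampling noise lies behind each proportion; the same proportion computed from 1,000 reads and from 100,000 reads carries very different uncertainty.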

Batch effects should be discussed as a potential source of confounding, including steps taken to ensure batch effects do not overlap exposures or outcomes of interest (item 4.13)28. If metatranscriptomics, metaproteomics or metabolomics are conducted, details of those methods should be provided (items 4.14–4.16).
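One simple check for the overlap described in item 4.13 is to cross-tabulate samples by batch and condition before sequencing. The Python sketch below is illustrative only: the field names `batch` and `condition`, the function names and the sample records are assumptions, not part of the checklist.

```python
from collections import Counter


def batch_by_condition(samples):
    """Cross-tabulate samples by (batch, condition)."""
    return Counter((s["batch"], s["condition"]) for s in samples)


def single_condition_batches(samples):
    """Return batches that contain only one condition.

    In such batches, the batch effect cannot be separated from the
    condition effect.
    """
    conditions_per_batch = {}
    for s in samples:
        conditions_per_batch.setdefault(s["batch"], set()).add(s["condition"])
    return sorted(b for b, conds in conditions_per_batch.items() if len(conds) == 1)


# Hypothetical sample sheet.
samples = [
    {"id": "S1", "batch": "run1", "condition": "case"},
    {"id": "S2", "batch": "run1", "condition": "control"},
    {"id": "S3", "batch": "run2", "condition": "case"},
    {"id": "S4", "batch": "run2", "condition": "case"},
]

table = batch_by_condition(samples)
flagged = single_condition_batches(samples)
print(dict(table))
print("batches with a single condition:", flagged)
```

A table like this can be reported directly, and a flagged batch signals that samples should be re-randomized across runs where that is still possible.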

Data sources/measurement (5.0)

For non-microbiome data (for example, health outcomes, participant socioeconomic, behavioral, dietary and biomedical characteristics, including disease location and activity, and environmental variables), the measurement and definition of each variable should be described (item 5.0). For example, a participant’s sex and age could be obtained from electronic medical records or from a questionnaire distributed to participants; this data source should be described. Limitations of measurements may also be discussed, including potential bias due to misclassification or missing data, as well as any attempts made to address these measurement issues.

Research design considerations for causal inference (6.0–6.1)

Observational data are often used to test associations that aim toward causal inference in situations in which the hypothesized causal relationship is not directly observed. Methods include, for example, the use of multivariable analysis or matching to adjust for confounding variables between a hypothesized exposure (such as abundance of a microbial taxon) and the disease or condition under study29. Confounders can be thought of as common causes of the exposure and the outcome under study that can induce a spurious association between the exposure and the outcome30,31. For example, age could be a common confounder due to its influence on the microbiome and on the risk of most health outcomes32. Laboratory batch effects could also confound relationships between the microbiome and a condition of interest if steps are not taken to avoid imbalance of the condition across batches33. A common method for attempting to control for measured confounding is to adjust for or stratify on the confounder30. Justification should be provided for variables included or excluded in regression models for causal inference (item 6.0), as adjusting for or stratifying on a non-confounding variable can introduce bias34. As part of this theoretical justification, authors should consider including a directed acyclic graph showing the hypothesized causal relationships of interest35,36.
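The effect of stratifying on a confounder can be illustrated with a small numerical sketch in Python. All counts below are invented for illustration: the hypothetical exposure is the presence of a taxon, the hypothetical confounder is age group, and the numbers are chosen so that age drives both exposure and outcome.

```python
def risk_difference(exposed, unexposed):
    """Risk difference from (cases, total) pairs for exposed and unexposed."""
    return exposed[0] / exposed[1] - unexposed[0] / unexposed[1]


def pooled(strata, arm):
    """Sum (cases, total) for one arm across strata, ignoring the stratifier."""
    cases = sum(s[arm][0] for s in strata.values())
    total = sum(s[arm][1] for s in strata.values())
    return cases, total


# Hypothetical (cases, total) counts by age stratum and exposure status.
strata = {
    "young": {"exposed": (2, 20), "unexposed": (8, 80)},
    "old": {"exposed": (40, 80), "unexposed": (10, 20)},
}

# The crude estimate ignores age; the stratified estimates condition on it.
crude_rd = risk_difference(pooled(strata, "exposed"), pooled(strata, "unexposed"))
stratified_rd = {
    name: risk_difference(s["exposed"], s["unexposed"]) for name, s in strata.items()
}

print(f"crude risk difference: {crude_rd:.2f}")
for name, rd in stratified_rd.items():
    print(f"risk difference within {name}: {rd:.2f}")
```

In this constructed example the crude comparison suggests an association that disappears within each age stratum, which is the pattern expected when age is a common cause of both the exposure and the outcome.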

In addition to consideration of the theoretical motivations for the present study, the potential for selection or survival bias that can distort the observed relationship between the microbiome and variable of interest should be discussed (item 6.1). For example, such bias may occur due to loss to follow-up (in longitudinal studies) or due to lack of inclusion of participants in the study due to the condition itself (for example, participants who have died of aggressive forms of colorectal cancer and have not survived to be in a hypothetical study of colorectal cancer microbiomes)37. Other items elsewhere in the checklist may be directly relevant to questions of causal inference, including hypotheses (item 2.1), study design (item 3.0), matching (item 3.8), bias (item 13.1) and generalizability (item 13.2). Authors investigating causal questions are encouraged to consider their reporting on these items in the context of causal inference as well.

Bioinformatics and Statistical Methods (7.0–7.9)

Adequate description of bioinformatics and statistical methods is essential to the production of a rigorous and reproducible research report. Data transformations (such as normalization, rarefaction and percentages) should be described (item 7.0). Quality-control methods should be fully disclosed, including criteria for filtering or removing reads or samples (item 7.1). All statistical methods used to analyze the data should be stated (item 7.3), including how results of interest were selected (for example, use of a P value, q value or other threshold) (item 7.8). Taxonomic, functional profiling or other sequence analysis methods should be described in detail (item 7.2). In the interest of reproducibility, all software, packages, databases and libraries used for the pre-processing and analysis of the data should be described and cited, including version numbers (item 7.9).
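To make the reporting of a selection threshold (item 7.8) concrete, the Python sketch below computes q values with the Benjamini-Hochberg procedure and applies a cut-off. The checklist does not prescribe this or any other method; the procedure, the 0.05 threshold and the P values are assumptions chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical P values for four taxa.
p = [0.001, 0.02, 0.03, 0.5]
q = benjamini_hochberg(p)

threshold = 0.05  # the threshold itself should be stated in the methods
selected = [i for i, value in enumerate(q) if value < threshold]
print([round(value, 3) for value in q])
print("indices of features passing the threshold:", selected)
```

Stating the adjustment method, the threshold and the number of features tested allows readers to reproduce exactly which results were reported as findings of interest.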

Reproducible research (8.0–8.5)

Reproducible research practices serve as quality checks in the process of publication and further transparency and knowledge sharing, as detailed in the rubric proposed by Schloss38. Journals are increasingly implementing reproducible research standards that include the publishing of data and code, and those guidelines should be followed when possible39,40. STORMS itemizes the accessibility of data, methods and code (items 8.0–8.5). If possible, raw data (item 8.1) and processed data (item 8.2) should be deposited into independently maintained public repositories that provide long-term availability, such as those maintained by NCBI or EMBL-EBI. Repositories such as Zenodo (https://zenodo.org/) or Publisso (https://www.publisso.de/en/) can be used to provide a DOI for processed datasets. If data or code are not or cannot be made publicly available, even in a repository that provides restricted-access options, a description of how interested readers can access the data should be provided. As stated in item 8.0, any protected information should be described, along with how such data can be accessed.

Results (9.0–10.4)

The results should be reported as outlined below.

Descriptive data (9.0)

Descriptive statistics about the study population should be reported (item 9.0). At a minimum, the age and sex of the study population should be described, and age and sex should be included for each participant in shared data files, but other important participant characteristics should be reported when possible, including medication use or lifestyle factors such as diet. Authors should consider reporting these data in a descriptive statistics table. Packages such as the table1 package in R software make the creation of such a table straightforward41.
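For readers working outside R, the same kind of summary can be assembled with the Python standard library. The sketch below is not an equivalent of the table1 package; the function name `describe`, the field names and the participant records are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def describe(participants, group_key="group"):
    """Summarize n, age (mean, SD) and sex counts for each study group."""
    groups = {}
    for p in participants:
        groups.setdefault(p[group_key], []).append(p)

    summary = {}
    for name, members in groups.items():
        ages = [m["age"] for m in members]
        summary[name] = {
            "n": len(members),
            "age_mean": mean(ages),
            # The sample SD is undefined for a single participant.
            "age_sd": stdev(ages) if len(ages) > 1 else float("nan"),
            "sex": dict(Counter(m["sex"] for m in members)),
        }
    return summary


# Hypothetical participant records.
participants = [
    {"id": "P1", "group": "case", "age": 40, "sex": "F"},
    {"id": "P2", "group": "case", "age": 50, "sex": "M"},
    {"id": "P3", "group": "case", "age": 60, "sex": "F"},
    {"id": "P4", "group": "control", "age": 30, "sex": "M"},
    {"id": "P5", "group": "control", "age": 50, "sex": "F"},
]

summary = describe(participants)
for group, stats in summary.items():
    print(group, stats)
```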

Outcome data (10.0–10.4)

The main outcomes of the study should be detailed, including descriptive information, findings of interest and the results of any additional analyses. Descriptive microbiome analysis (for example, dimension reduction, such as principal coordinates analysis, measures of diversity and gross taxonomic composition) should be reported for each group and each time point (item 10.0). This contextualizes the results of differential abundance analysis for readers. When differential abundance test results are reported, the magnitude and direction of differential abundance should be clearly stated (item 10.2) for each identifiable standardized taxonomic unit (item 10.1). Results from other types of analyses, such as metabolic function, functional potential, MAG assembly and RNAseq, should be described in the results as well (items 10.3 and 10.4). Additional results (for example, non-significant results or full differential abundance results) can be included in supplements and should not be excluded entirely. Although the problem has been known for decades39, journals across many fields are recognizing the issue of publication bias and therefore the issue of non-reporting of null results40. Including such results in publications will help to reduce the severity of this bias and improve future systematic reviews and meta-analyses.
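One common way to express both the magnitude and the direction of a difference (item 10.2) is a log2 fold change between group means. The Python sketch below assumes this effect measure purely for illustration; the checklist does not mandate it, and the pseudocount, group labels and abundance values are hypothetical choices that would themselves need to be reported.

```python
import math


def log2_fold_change(mean_case, mean_control, pseudocount=1e-6):
    """Log2 ratio of mean relative abundance, case over control.

    The pseudocount avoids division by zero and is an analytic choice
    that should be reported.
    """
    return math.log2((mean_case + pseudocount) / (mean_control + pseudocount))


def direction(lfc):
    """Describe the direction of the difference relative to controls."""
    if lfc > 0:
        return "higher in cases"
    if lfc < 0:
        return "lower in cases"
    return "no difference"


# Hypothetical mean relative abundances of one taxon.
lfc = log2_fold_change(0.08, 0.02, pseudocount=0.0)
print(f"log2 fold change: {lfc:.2f} ({direction(lfc)})")
```

Stating which group is the numerator removes the ambiguity that arises when a result is described only as "differentially abundant".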

Discussion (11.0–14.0)

Most recommendations for the discussion section are similar to those of STROBE, including a discussion of the limitations of the present study and associated methods (item 13.0). One additional recommendation is made in the STORMS guidelines: the potential for biases and how they would influence the study findings (item 13.1) should be discussed. Many forms of bias, such as residual/unmeasured confounding, bias related to compositional analysis42, measurement bias or selection bias43, could affect the interpretation of the results of the study, and it is important to acknowledge potential sources of bias in discussion of the results44. As described in STROBE, authors should also consider the generalizability of their findings and if these findings could be applicable to the target population or other populations (item 13.2). If different forms of bias were not assessed or assumed to be negligible, this should be stated. Finally, authors should discuss potential future or ongoing research based on findings of the present study (item 14.0).

Other information (15.0–17.0)

In addition to including a statement of funding (item 15.0), authors should also include acknowledgements and a conflicts of interest statement (items 15.1 and 15.2, respectively). Conflicts of interest statements should be written according to the criteria established by the journal. Finally, the paper should state where supplementary materials and data can be accessed (items 16.0 and 17.0).

Implementation

The STORMS checklist can streamline peer review by providing both a checklist for assessing completeness and a roadmap with pointers to the manuscript. We recommend that before submission, authors use the ‘Comments’ field to provide explanations where warranted and to refer to relevant sections of the manuscript, to make the work of peer reviewers more straightforward and more accurate. We provide two examples of pre-publication use of STORMS: (i) a multi-site study of associations between essential hypertension and gut microbial metabolic pathways45; and (ii) an observational study of the stool microbiota of multiple host species46. Additional post-publication examples are also available47,48; these highlight the fact that references to line numbers must be updated during production or they will continue to refer to the pre-print version.

Discussion

The STORMS checklist for reporting on human microbiome studies was developed with the following priorities: the checklist should (1) be easy to understand and use by researchers from various fields, through straightforward use of language and pruning of items rarely relevant to the current literature; (2) be organized in the outline of a manuscript, so it can serve as a tool for authors and for peer reviewers, particularly when included in manuscript submission as a supplemental table with comments; (3) assist in the complete and organized reporting of a study, not in enforcing any particular methods; (4) reuse or modify items from related checklists where relevant; and (5) represent consensus across a broad cross-section of the human microbiome research community. The checklist helps manuscript authors provide a complete, concise and organized description of their study and its findings. Included as a supplemental table to a manuscript, it also supports efficient peer review and post-publication interpretation.

Although other efforts to extend STROBE for microbiome and metagenomics studies have been proposed9 and laboratory-focused reporting checklists have been released4,49, to our knowledge, STORMS is the first comprehensive reporting checklist for human microbiome research. We aim for the STORMS guidelines to improve the quality and transparency of microbiome epidemiology studies by introducing a shared grammar of study reporting in a structured checklist format. Reporting checklists introduced in other disciplines have been shown to improve the quality of journal articles5,6.

A major strength of the STORMS checklist is the rigor and transparency of its development by a diverse, multidisciplinary consortium of subject-matter experts. The development of the STORMS checklist is an ongoing process, and new versions of the checklist will be released to reflect evolving standards and technological processes. A version-control system with a change log has been implemented, and annual reviews of the checklist are planned. Additionally, the working group plans to evaluate the impact of the STORMS checklist on microbiome reporting by examining how many articles fulfill checklist items before and after its release. We invite interested readers to join the STORMS Consortium by contacting the corresponding author or by visiting the consortium website for more information (https://www.stormsmicrobiome.org/). We also encourage journals to include the STORMS checklist in their instructions to authors and to advise peer reviewers to consult the checklist when reviewing submissions.

There are some limitations to the STORMS checklist. The checklist was not created to assess study or methodological rigor. It is meant to aid authors’ organization and ease the process of reader assessment of how studies are conducted and analyzed. Conclusions about the quality of studies should not be made on the basis of their adherence to STORMS guidelines, although we expect the reporting guidelines to help readers review studies critically. The STORMS checklist does not encourage, discourage or assume the use of null hypothesis significance testing50 or methods of compositional data analysis51, topics of some controversy in the field. In general, the checklist avoids reference to or guidance on specific statistical methodological decisions.

Through the efforts of the STORMS Consortium working in an iterative, transparent and collaborative process, the STORMS checklist provides a roadmap for researchers in reporting the results of a human microbiome study. The STORMS Consortium believes that the checklist is sufficiently flexible and user-friendly to support widespread adoption and contribution to microbiome study standards. Its adoption will ideally encourage thoughtful study design, reproducibility, collaboration and open knowledge sharing between research groups as they explore the human microbiome.