Keywords
Systematic Reviews, Research Methodology, Research Conduct, Sample Size, Decision-Making, Plethora of Evidence, COVID-19, Chronic Kidney Disease (CKD)
Since the onset of the COVID-19 pandemic, there has been an exponential increase in the number of publications. In the one-year period from October 2019 through October 2020, more than 48,000 peer-reviewed articles related to COVID-19 were published on PubMed.1 This was in addition to the thousands of pre-prints that were made available and used in decision making.2 This high and accelerated research output, although attempting to fill knowledge gaps, comes with a set of limitations and challenges. One challenge faced by the systematic review community is the added burden of screening, abstracting, and synthesizing a plethora of evidence. For this reason, some reviewers have been triaging included studies based on specific criteria, including the study sample size.3 This abstract was previously presented at the 7th International Medicine & Health Sciences Congress (IMedHSC) in Paris, France, in December 2021. It is published as an online publication.4
Conceptually, systematic reviews are considered the best source of evidence, as they aim to comprehensively identify and summarize all evidence for a specific clinical question.5 For that reason, reviewers ideally incorporate studies of all sample sizes, especially when evaluating rare diseases.6 Systematic reviews identify studies of different sample sizes and types. However, there is no empirical evidence on whether excluding smaller studies would affect or bias the pooled results of risk factors on prognosis when there is a plethora of evidence. The rationale for exclusion can be justified by the observations that larger studies 1) are usually of higher quality,7 2) provide more precise effect estimates,8 and 3) contribute a higher weight to the calculation of the overall effect estimate in meta-analyses.
Sensitivity analyses are often used to determine the robustness of systematic review results. They examine the effect of changing specific assumptions, methods, and other variables on the overall results.9 In this paper, we aim to assess the robustness of results with respect to study sample size. We assess the difference between the overall effect estimates derived from studies of all sample sizes and those obtained after excluding smaller studies, using a case study of the association between chronic kidney disease (CKD) and COVID-19 mortality.
This study was done as a sensitivity analysis of a previously completed systematic review, based on a case study of COVID-19 mortality in patients with CKD. We conducted this analysis to assess the effect of the sample size of individual studies and to evaluate whether it would affect the pooled effect estimate. The goal of the original review was to assess the outcomes of COVID-19 in patients with CKD.10 In that study, a team of reviewers (AE, PT, SJ, RM, AG) conducted a comprehensive systematic search of published databases (Embase, Epistemonikos, and Medline) and unpublished databases (Medrxiv, LitCOVID, and SSRN) between January 1, 2020 and January 10, 2021. We also included patients from four registries: Hilbrands 2020,11 Holman 2020,12 Jager 2020,13 and Williamson 2020.14 We used a variety of keywords and subheadings on CKD and COVID-19 to identify all studies on the topic. Included studies were of any study type: cross-sectional, prospective, and retrospective cohort studies. The full search strategy can be found in supplementary data.15 As this review was based on existing literature, there was no involvement of patients or members of the public.
Reviewers included studies that reported the risk of mortality among patients with CKD and COVID-19. Studies that did not report data on mortality or on patients with CKD and COVID-19, or that did not separate results for patients with CKD, were excluded. The selection process was conducted using Endnote or a standardized Excel sheet. We grouped studies based on the type of effect estimate reported: hazard ratio (HR), risk ratio (RR), or odds ratio (OR). We pooled the data and created forest plots using a random-effects model in RevMan 5.4.16 We assessed publication bias17 by visually inspecting funnel plots for asymmetry. The full methods and results of the review were published separately.10 More studies were included in the original review, as studies reporting outcomes other than mortality were also included. We extracted data using a pre-tested Excel sheet. Reviewers abstracted the following information from each study: first author name, date and country of publication, population characteristics (age, gender, prevalence of CKD), and effect estimates with the associated 95% confidence interval (CI). Two reviewers separately completed screening and data abstraction and resolved disagreements by discussion or arbitration. To compare the effect of different sample sizes on the overall effect estimate, we conducted a sensitivity analysis based on sample size per study. We utilized three different cut-offs to define smaller studies: 500, 1,000, and 2,000 patients. These cut-offs classified approximately 20%, 40%, and 50% of the studies, respectively, as smaller studies.
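For illustration, ratio-scale effect estimates (HR, OR, or RR) with 95% CIs can be pooled on the log scale with a DerSimonian-Laird random-effects model, the method RevMan uses for random-effects meta-analysis. The sketch below is not the review's actual computation; the function name and all values are hypothetical, and it assumes at least two studies:

```python
import math

def pool_random_effects(estimates, ci_lows, ci_highs):
    """Pool ratio-scale effect estimates (HR/OR/RR) on the log scale
    with a DerSimonian-Laird random-effects model (>= 2 studies)."""
    y = [math.log(e) for e in estimates]
    # Back-calculate each study's standard error from its 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96)
          for lo, hi in zip(ci_lows, ci_highs)]
    w = [1 / s ** 2 for s in se]                 # inverse-variance weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the DL estimate of between-study variance tau^2.
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = [1 / (s ** 2 + tau2) for s in se]     # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    # Exponentiate back to the ratio scale: estimate, CI low, CI high.
    return (math.exp(y_re),
            math.exp(y_re - 1.96 * se_re),
            math.exp(y_re + 1.96 * se_re))
```

For example, pooling two hypothetical studies with `pool_random_effects([1.5, 2.0], [1.1, 1.4], [2.05, 2.86])` yields a pooled estimate between the two inputs, with its own 95% CI.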
Seventy-five studies out of the 137 included in the original review were eligible for inclusion, with a total of 30,566,586 participants. Seventy studies were cohort studies, four were cross-sectional studies, and one was a case-control study. Sixty-five studies were identified in published databases and ten in unpublished databases. Of the 75 included studies, 40 had a sample size of >2,000 patients, seven had 1,000-2,000 patients, 11 had 500-1,000 patients, and 17 had <500 patients. The grouping for the meta-analysis was based on the reported effect estimate: OR, HR, or RR (see Table 1). The flow chart of the original review is available in supplementary data.15
Reported EE* | Studies with <500 patients | Studies with <1,000 patients | Studies with <2,000 patients | Studies with >2,000 patients
---|---|---|---|---
HR | 7 | 11 | 14 | 17
OR | 7 | 14 | 18 | 20
RR | 3 | 3 | 3 | 3
Total | 17 | 28 | 35 | 40

*EE: effect estimate; HR: hazard ratio; OR: odds ratio; RR: risk ratio. Counts in the <1,000 and <2,000 columns are cumulative.
Across the effect estimates RR, OR, and HR, there was an increased risk of mortality in patients with CKD and COVID-19. Based on all included studies of all sample sizes, the HR was 1.57 (95% CI 1.42-1.73), the OR was 1.86 (95% CI 1.64-2.11), and the RR was 1.74 (95% CI 1.13-2.69) (see Table 2).
EE* | Overall | Studies with <500 patients | Studies with >500 patients | Studies with <1,000 patients | Studies with >1,000 patients | Studies with <2,000 patients | Studies with >2,000 patients
---|---|---|---|---|---|---|---
HR [95% CI] | 1.57 [1.42, 1.73] | 1.81 [1.34, 2.44] | 1.54 [1.39, 1.71] | 2.07 [1.58, 2.71] | 1.48 [1.33, 1.65] | 1.93 [1.58, 2.37] | 1.46 [1.30, 1.63]
OR [95% CI] | 1.86 [1.64, 2.11] | 1.95 [1.28, 2.97] | 1.84 [1.61, 2.10] | 2.09 [1.54, 2.84] | 1.77 [1.54, 2.02] | 2.10 [1.63, 2.71] | 1.74 [1.50, 2.01]
RR [95% CI] | 1.74 [1.13, 2.69] | 1.87 [1.13, 3.12] | 1.61 [0.88, 2.92] | 1.87 [1.13, 3.12] | 1.61 [0.88, 2.92] | 1.87 [1.13, 3.12] | 1.61 [0.88, 2.92]

*EE: effect estimate; CI: confidence interval; HR: hazard ratio; OR: odds ratio; RR: risk ratio.
When excluding the 17 studies with <500 patients, the HR was 1.54 (95% CI 1.39-1.71), the OR was 1.84 (95% CI 1.61-2.10), and the RR was 1.61 (95% CI 0.88-2.92). When excluding the 28 studies with <1,000 patients, the HR was 1.48 (95% CI 1.33-1.65), the OR was 1.77 (95% CI 1.54-2.02), and the RR was 1.61 (95% CI 0.88-2.92). When excluding the 35 studies with <2,000 patients, the HR was 1.46 (95% CI 1.30-1.63), the OR was 1.74 (95% CI 1.50-2.01), and the RR was 1.61 (95% CI 0.88-2.92). All effect estimates from the sensitivity analyses were very similar to the overall effect estimates (see Table 2). All forest plots can be found in supplementary data.15
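Conceptually, each sensitivity analysis re-pools the data after filtering out studies below a sample-size cut-off. A minimal sketch of that filtering step, using entirely hypothetical study records (the actual analysis was run in RevMan on the 75 included studies):

```python
# Hypothetical study records: (sample_size, effect_estimate, ci_low, ci_high).
studies = [
    (320, 2.1, 1.2, 3.7),
    (850, 1.9, 1.3, 2.8),
    (4_500, 1.6, 1.4, 1.8),
    (120_000, 1.5, 1.4, 1.6),
]

def sensitivity_by_cutoff(studies, cutoffs=(500, 1_000, 2_000)):
    """For each cut-off, return the subset of studies that would remain
    after excluding those with fewer than `cutoff` patients."""
    return {cutoff: [s for s in studies if s[0] >= cutoff]
            for cutoff in cutoffs}

subsets = sensitivity_by_cutoff(studies)
for cutoff, remaining in subsets.items():
    print(f"cutoff {cutoff}: {len(remaining)} of {len(studies)} studies retained")
```

Each retained subset would then be pooled again with the same random-effects model and compared against the overall estimate.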
When assessing the funnel plots for studies reporting HR and OR, there was no suspicion of publication bias among the included studies (see Figures 1 and 2).
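The funnel-plot assessment above was purely visual. As a rough numeric stand-in for that inspection (illustrative only; the function and data are hypothetical, not the review's method), one can check how the less precise studies fall on either side of the precision-weighted mean, since a large imbalance corresponds to the asymmetry that suggests publication bias:

```python
import math
from statistics import median

def crude_asymmetry_check(estimates, ci_lows, ci_highs):
    """Among the less precise half of the studies, count how many fall
    on each side of the precision-weighted mean log effect estimate."""
    logs = [math.log(e) for e in estimates]
    # Standard errors back-calculated from the 95% CIs.
    ses = [(math.log(hi) - math.log(lo)) / (2 * 1.96)
           for lo, hi in zip(ci_lows, ci_highs)]
    weights = [1 / s ** 2 for s in ses]
    centre = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
    cut = median(ses)
    imprecise = [y for y, s in zip(logs, ses) if s >= cut]
    left = sum(1 for y in imprecise if y < centre)
    right = len(imprecise) - left
    return left, right
```

A roughly even split (e.g., `(1, 1)` for balanced hypothetical data) is consistent with the symmetric funnel shape that argues against publication bias; formal alternatives such as Egger's regression test exist but were not used here.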
In this paper, we assess the impact of limiting prognostic systematic reviews to large studies when there is a plethora of evidence available, using a case study assessing chronic kidney disease (CKD) as a risk factor for mortality among people infected with COVID-19. Pooled results were robust across different effect estimates (HR, OR, and RR) and different cutoff values used to define smaller studies (500, 1,000, and 2,000 patients).
This study has multiple strengths. First, the review included a large number of studies (75) with a large number of patients (30,566,586 participants) reporting different effect estimates (HR, RR, and OR). This allowed the investigators to explore the effect of the sensitivity analysis on all of these effect estimates. Second, there is no established cutoff or existing guidance on what is considered a smaller study. Hence, we assessed the effect of excluding smaller studies using multiple cutoffs. Some reviews exclude studies that included <10 patients,3 and some used much larger cutoffs (e.g., <10,000 patients).18 The decision about smaller versus larger studies should consider multiple factors: the number of studies and participants included in the systematic review and meta-analysis, as well as the specific disease or condition being assessed. For example, rare diseases may not yield studies with large sample sizes, so the definition of 'larger' itself may vary. Therefore, such a definition should be validated through sensitivity analysis using different cutoffs. Third, we assessed the relation between risk of bias and study size. We did not find any intrinsic difference between the smaller and larger studies included in this paper. The risk of bias table for all studies can be found in the separate systematic review.10 In that review, the included studies' risk of bias was not sufficient to explain the inconsistency in results. It would have been inappropriate to exclude smaller studies if they had been the studies of higher quality (lower risk of bias). The lack of a clear association between study size and risk of bias allowed for unbiased results after the exclusion of smaller studies. Finally, we also assessed the effect of publication bias on the results.
One potential explanation of why smaller studies usually yield higher effect estimates than the overall effect estimate is a phenomenon referred to as the 'small study effect'.19 The small study effect usually arises from publication bias: smaller intervention studies with larger effect sizes have a higher chance of being published, since a larger treatment effect is needed to produce statistically significant results.20 In addition, studies with larger effect sizes are cited more often and have a higher chance of being captured in searches. The impact of this phenomenon on studies of prognosis is not clear. A systematic review of osteoarthritis studies showed that effect estimates from larger studies (>100 patients) were smaller than those derived from smaller studies and closer to the overall effect estimates.21 Another meta-analysis showed that effect estimates from smaller studies exaggerated benefit when compared with larger studies.22 We did not suspect publication bias when assessing the funnel plots visually. Studies with different effect estimates were distributed evenly across different study sizes.
The study has a few limitations. We rely on a single case example, so the findings may not be representative of the effect of excluding smaller studies in other reviews. However, we believe the large numbers we incorporated allow for making inferences with reasonable certainty. Another limitation is that the case relates to prognostic studies, which may not be generalizable to other types of systematic reviews. Additional empirical evidence, and possibly simulation studies, may help increase the generalizability of these findings.
The conduct of smaller studies has been controversial, as they lack the statistical power to test a hypothesis, might expose patients to unnecessary harm, and are more prone to methodological errors.22 Similar to what we are suggesting, some researchers have proposed that smaller studies be excluded from systematic reviews, as they contribute minimal power to the overall effect and tend to potentiate publication bias.23 A paper by Stanley suggested using only the 10% most precise studies in the meta-analysis.24 Another study by Turner found little impact of underpowered studies on the overall effect in Cochrane reviews if at least two adequately powered studies were included.25 However, the inclusion of small studies in meta-analysis has been justified on the grounds that, when combined, these studies can produce a significant effect, and that they are important for evaluating rare diseases.6 This manuscript does not aim to discourage the conduct of smaller prognostic studies; prognostic studies of any size are encouraged. Rather, we address the validity of excluding smaller studies when there is a plethora of evidence.
Including only large studies carries the risk of excluding potentially relevant evidence reported in small studies. For this reason, other solutions to accelerate the review process include the implementation of automated systems.26 One study showed that a semi-automated approach using machine learning software reduced the time required with a low risk of missing articles.27 Many automated software tools are now available to assist reviewers in conducting systematic reviews; however, their use has not yet been widely adopted.26 Additionally, excluding smaller studies may or may not have implications for decision making, depending on factors such as the prevalence of the condition, the baseline characteristics of patients, and the nature of the patient-important outcomes. In this case study, the difference between the overall effect estimates including all studies regardless of sample size and the effect estimates after excluding smaller studies was very small and not statistically significant.
Our results showed no major difference between the overall effect estimates from studies of all sample sizes and those obtained after excluding smaller studies. This exploratory work provides preliminary evidence that, as a matter of practical convenience, it might be acceptable to rely on evidence from larger studies in prognostic systematic reviews when there is a plethora of evidence (e.g., during the COVID-19 pandemic, with its exponential increase in the number of publications). Nonetheless, caution must be exercised when using this evidence for policy and decision making. Additional work (e.g., simulation studies) is needed to further explore the effect of excluding smaller studies in multiple settings.
Figshare. Small studies in systematic reviews: To include or not to include? https://doi.org/10.6084/m9.figshare.21739295.v1. 15
The project contains the following underlying data:
• Data supplement 1 Search Strategy Small Studies.pdf. (Search strategy for all databases in this study).
• Data supplement 2 PRISMA flowchart Small studies.pdf. (Selection process flowchart for this study).
• Data Supplement 3 Forest plots.docx. (Nine forest plots of comparison).
Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Statistics. Nephrology.