Introduction

As of 14 April 2022, SARS-CoV-2 infection has killed over 6.1 million of individuals and has posed a serious threat to public health worldwide [1, 2]. There is high heterogeneity in clinical severity of SARS-CoV-2 infection, ranging from asymptomatic to fatal [3]. Solid evidence from epidemiological studies reported that type 2 diabetes (T2D) is strongly associated with COVID-19 adverse outcomes [4,5,6,7,8,9,10]. But whether the occurrence of T2D and severe COVID-19 is coincidence or causally associated has not been determined due to the limited number of cases and follow-up time, and whether this association is attributed to the correlation between T2D and other comorbidities remains unclear.

Therefore, a better understanding of the pathogenic mechanism underlying the co-occurrence of T2D and COVID-19 severity is warranted, which may help to allocate proper medical resource, make vaccination decisions for patients with T2D as well as develop effective therapeutic strategies for severe COVID-19. Recently, shared genetic architecture can provide insight into the biological basis of the comorbidity phenomenon. For example, Fadista et al. observed shared genetic overlap between idiopathic pulmonary fibrosis and severe COVID-19 [11]. A recent study detected strong genetic correlations between gastro-esophageal reflux disease and COVID-19 severity [12]. Both T2D and severe COVID-19 are moderately heritable, with estimates from the literature ranging from approximately 18.0% for T2D [13] and 6.5% for severe COVID-19 [14], respectively. Moreover, co-shared genetic etiology between T2D and numerous psychiatric disorders characterized by both internalizing and externalizing symptomatology that also shared molecular genetics with COVID-19 outcomes [15, 16]. Therefore, we speculated that genetic determinants contribute to the comorbidity of T2D with severe COVID-19.

To fill this urgent gap, we utilized large-scale genome-wide association study (GWAS) datasets to appraise genetic correlations, potential causality, and shared genetic components between T2D and COVID-19 severity.

Methods

Data source

The dataset of T2D was obtained from the most recent meta-analysis of T2D conducted by Mahajan et al. based on 32 European-descent GWASes, reaching 898,130 individuals [13]. Meanwhile, the T2D-unadjusted for BMI and T2D-adjusted for BMI datasets are subsets of the GWASes meta-analysis of T2D conducted by Mahajan et al. [13]. As obesity is an established risk factor for both T2D and severe COVID-19, we selected the BMI-adjusted T2D-association data for analysis in the current study. Summary genetic association estimates for COVID-19 severity (hospitalization and severe disease) were extracted from largest GWAS meta-analysis of COVID-19 to data, by the COVID-19 Host Genetics Initiative (HGI, round 5, shared publicly on 18 January 2021) [17]. The hospitalized outcome compared 9986 hospitalized COVID-19 patients (vs. 1,877,672 general population). The severe disease outcome compared 5101 confirmed as "very severe respiratory" cases (vs. 1,383,241 general population). The details of these GWASes were elaborated in Supplementary Table 1 and further information is provided elsewhere [13, 17, 18]. To minimize the potential bias due to ethnic heterogeneity, participants included in the above summary statistics were restricted to European ancestry only. Since all analyses were based on publicly available data, no ethical approval was required for present study.

Linkage disequilibrium score regression (LDSC) of cross-trait genetic correlation

We applied LDSC to estimate single nucleotide polymorphism (SNP)-based heritability (h2) of T2D and COVID-19 severity (including hospitalization and severe disease) as well as their genetic correlation (rg) [19, 20]. Briefly, LDSC carried out a weighted linear model by regressing the cross products of z statistics for two phenotypes on the LD scores across all available genetic variants. We used the pre-computed LD scores of European samples from the 1000 Genomes Project provided in the website of this software (http://github.com/bulik/ldsc). The regression slope provides an estimation of genetic correlation between T2D and COVID-19 severity.

Assessing the causality of T2D with COVID-19 severity

Selection of genetic instruments

Independent T2D-related SNPs (p < 5 × 10–8) were initially selected as the instrumental variables (IVs) for MR analyses. Where an initial genetic instrument was not present in the outcome GWAS datasets (here, COVID-19 hospitalization and severe disease), it can be replaced by a suitable proxy variant that were in high LD with the initial variant (r2 > 0.6). Furthermore, we computed R2 of each T2D-related SNP and calculated the F statistic with the following formula: F statistic = R2 × (SampleSize − 2)/(1 − R2) [21, 22]. A threshold of F statistic over 10 indicated lower risk of weak instrument bias [22].

Mendelian randomization analyses

To investigate causality of T2D with COVID-19 severity, two-sample MR analyses were employed across four different MR methods: inverse-variance weighted (IVW) [23, 24], weighted-median [25], weighted mode [26] and MR Egger [27] methods. To ensure the robustness of results, Cochran Q test and MR-Egger intercept test were used to examine heterogeneity and directional horizontal pleiotropy, respectively. Leave-one-out analysis and MR-Pleiotropy Residual Sum and Outlier (MR-PRESSO) were used to identify outlier IVs reflecting pleiotropic bias [28]. We further applied the PhenoScanner v2 [29] to check whether any of the selected IVs for T2D were associated (p < 1 × 10–5) with other phenotypes at risk of affecting COVID-19 severity, restricting to obesity and vascular or heart problems diagnosed by doctor: high blood pressure.

All MR analyses were carried out using the R packages ''TwoSampleMR'', ''MendelianRandomization'' and ''MR-PRESSO'' [30, 31]. A p value less than 0.05 was considered as statistically significant.

Gene-based association and enrichment analysis

Gene-based association analysis was conducted using Multi-marker Analysis of Genomic Annotation (MAGMA) based on SNP-wise mean model [30]. SNPs were assigned to a gene if they located within the gene body or mapped within the 20 kb upstream or downstream region. The Phase 3 of 1000 Genomes European population was used for reference panel to calculate the LD information. The maximum and minimum numbers of permutations were set to ten thousand and ten for per gene, respectively. The gene-based p value was calculated based on asymptotic sampling distribution. Potentially pleiotropic genes that were significantly associated with both T2D and COVID-19 severity with p values less than 0.05 were identified for the subsequent gene set enrichment analysis. The Gene Ontology (GO) enrichment analysis was carried out using the "clusterProfiler" package with the ontology term "BP" for biological process [31]. p values were adjusted for multiple comparisons using the Benjamini–Hochberg method.

Results

Genetic correlation of T2D with COVID-19 severity

The SNP-based heritability was estimated to 17.53% (SE = 0.013) for T2D, 0.20% (SE = 0.0005) for COVID-19 hospitalization, and 0.35% (SE = 0.0007) for COVID-19 severe. Significant positive genetic correlations between T2D and COVID-19 hospitalization (rg = 0.156, SE = 0.057, p = 0.005) or severe COVID-19 (rg = 0.155, SE = 0.057, p = 0.006) were detected with the LDSC software.

Causal relationship assessment between T2D and COVID-19 severity

A total of 191 independent SNPs were included as IVs for T2D, the F-statistic values for each SNP was above the threshold 10 (ranging from 28 to 1280, with means of 70) (Supplementary Table 2). Genetically predicted T2D was not associated with COVID-19 hospitalization or severe disease across four MR approaches (Table 1). Furthermore, the testing of Cochran’s Q test suggested evidence of heterogeneity, and MR-PRESSO detected horizontal pleiotropy outliers across estimates of included SNPs for COVID-19 hospitalization (Supplementary Table 3 and Fig. 1). After removing the outliers, no heterogeneity for T2D remained and consistent null causal association was observed (all p > 0.05). In addition, leave-one-out analysis did not detect any leverage points with significant influence (Supplementary Fig. 2). We further utilized the PhenoScanner tool to check if the SNPs used in this MR study were associated with any other phenotypes. Eighteen SNPs were associated with at least one of obesity and high blood pressure as shown in Supplementary Table 4. After removing these SNPs, non-causal effect estimation remained for COVID-19 hospitalization and severe disease (all p > 0.05) (Table 1).

Table 1 Causal relationships of T2D on COVID-19 severity estimated by approaches of IVW, weighted median, weighted mode and MR Egger
Fig. 1
figure 1

Scatter plots for effect sizes of SNPs for T2D and those for COVID-19 hospitalization and severe disease. The x-axis represents the effect size of SNPs on T2D; the y-axis represents the effect size of SNPs on COVID-19 hospitalization (A) and severity (B). Colors of fitted line represents for four MR approaches. T2D type 2 diabetes, SNPs single nucleotide polymorphisms, MR Mendelian randomization

Gene-based analysis and functional enrichment analyses

MAGMA gene-based association analysis identified 9 005, 1 438 and 1 393 genes nominally significant (pgene < 0.05) in the T2D, COVID-19 hospitalization, and severe disease GWAS level association results, respectively. Then, by leveraging gene-based test outputs, we looked for overlapping genes between T2D and COVID-19 hospitalization, and severe disease, and revealed a total of 770 and 784 significantly enriched genes shared by the two disorders at pgene < 0.05, respectively. Further GO enrichment analysis identified 58, 16 enriched biological pathways between T2D and COVID-19 hospitalization, severe disease at pFDR less than 0.2, respectively, including response to type I interferon (GO:0034340), glutathione derivative metabolic process (GO:1901685), glutathione derivative biosynthetic process (GO:1901687), regulation of ATPase activity (GO:0043462), natural killer cell activation involved in immune response (GO:0002323), interleukin-7-mediated signaling pathway (GO: 0038111), etc. (Supplementary Tables 5–6).

Discussion

By leveraging the genetic correlation estimation, MR analysis, gene-based association and enrichment analyses, our results provided insights into the relationship between T2D and severe COVID-19.

We detected positive genetic correlations and further identified shared molecular mechanisms between T2D and severe COVID-19. Some of the molecular mechanisms that lead to T2D are also crucial in the progression of COVID-19. In agreement with this, Byun et al. utilized the summarized data from the UK Biobank observed robust positive genetic correlations between diabetes and COVID-19 severity (rg = 0.254) or hospitalization (rg = 0.309) [32]. We found that genetically predicted T2D did not increase the risk of COVID-19 severity, in keeping with earlier MR studies employing data from the previous data freeze (COVID-19 HGI, release 3 and 4, 2020) with smaller sample size for hospitalized as well as severe COVID-19 cases [33, 34]. Our current work also improves on previous MR investigations, as multiple complementary MR techniques and sensitivity analysis (MR-PRESSO and manually pruning pleiotropic variants) were performed to appraise the robustness of causal inference.

The relation between T2D and severe COVID-19 is more likely to be coincidental rather than causal, some other mechanisms of connection may account for their co-occurrence. This notion is supported by the results of our gene-set analysis, which identified common shared etiological pathways underlying T2D and COVID-19 severity such as response to type I interferon, glutathione derivative metabolic process, and glutathione derivative biosynthetic process. T2D and severe COVID-19 are hypothesized to be linked by several pathophysiological mechanisms including the role of inflammation, altered immune status, glucose homeostasis and activation of the renin–angiotensin–aldosterone system [35]. Notably, inflammation is central to the aetiology of the two traits [36]. The type I interferon (INF, INF-α and INF-β) response induce the expression of various interferon-stimulated genes (ISGs) that confer antiviral activities to host cells, but the aberrant and inappropriate activation can exacerbate inflammation [37], and this response was significantly enriched biological mechanism underpinning T2D and COVID-19 hospitalization and severe disease. Delayed but exaggerated type I INF responses, associated with hyperinflammation and contributing to severe progression of COVID-19, were observed in patients with severe COVID-19 [38]. Specifically, in early stage of infection, SARS-CoV-2 inhibits the host type I INF responses, resulting in delayed virus clearance and higher SARS-CoV-2 viral loads, which makes patients have a higher risk of developing severe disease [39]. A recent study reported that high glucose suppressed type I INF production and subsequent signaling in human peripheral blood mononuclear cells [40]. Another study also found that dendritic cells, which are the “professional” IFN producing cells, are poor producers of INF-α in human diabetes [41]. Furthermore, Bhavya et al. suggested that host microRNAs play a pivotal role in linking the diabetes and COVID-19 severity [42].

"Glutathione derivative metabolic process" and "glutathione derivative biosynthetic process" are well confirmed in the pathway of T2D and severe COVID-19 illness in present research. Glutathione (GSH), a thiol-containing tripeptide made up of L-glutamate, cysteine and glycine, exerts function against oxidative stress in most living organisms, based on its redox buffering action and abundance in the cell [43, 44]. The main cause of oxidative stress is an imbalance between creation and buildup of reactive oxygen species (ROS) [45]. Angiotensin-converting enzyme 2 (AEC2), the receptor of SARS-CoV-2, forms angiotensin (Ang) 1–7 from Ang II [46]. High Ang II concentrations and low Ang 1–7 concentrations lead to oxidative stress and, consequently, causes oxidation of the cysteine residues in the peptidase domain of receptors ACE2 as well as the receptor binding domain of proteins SARS-CoV-2 spike protein, being associated with more severe forms of COVID-19 [45]. T2D was widely considered to be associated with oxidative damage, and characterized by increased oxidized glutathione (GSSG/GSH ratio and decreased GSH content in different tissues [47]. Weaken antioxidant defense system exposed to the pathogenesis of T2D, might help to explain why patients with T2D are more likely to develop severe COVID-19 illness.

The major advantage of our current study is the efficient use of a series of genetic statistical techniques to address an important clinical association between T2D and COVID-19 severity at the molecular genetic level. However, several limitations ought to be noted. First, according to lack of the individual genetic data, we were unable to quantify the proportion of sample overlap between T2D and COVID-19 severity datasets, which may confound LDSC and MR analyses. Second, study participants included in the COVID-19 severity GWAS meta-analysis were not screened for T2D at baseline. In MR studies, the existence of exposure in the outcome dataset may have an impact on the causal estimates. However, this is a general restriction of two-sample MR analyses and is inevitable without individual-level data [48]. Third, since all analyses herein utilized summary data from Europeans and focused on severe COVID-19, the associations may not apply to other populations or patients with asymptomatic to moderate COVID-19. Lastly, genetic factors only explained a tiny fraction of T2D etiology, the causality of T2D and severe COVID-19 cannot be fully ruled out with these analyses.

In conclusion, our findings further confirm the comorbidity of T2D and COVID-19 severity, but a non-causal impact of T2D on severe COVID-19. Shared genetically modulated molecular mechanisms underlying the co-occurrence of the two disorders are crucial for identifying therapeutic targets.