A Post-GWAS Analysis of the Shared Genetic Architecture Between COVID-19 and Coronary Artery Disease

Ali, Muhammad Sarfraz; Haider, Waseem; Aziz, Sana; Mohammad, Anwaruddin; Manichaikul, Ani; Shi, Weibin

doi:10.3390/ijms27094132

by Muhammad Sarfraz Ali ¹,², Waseem Haider ², Sana Aziz ³, Anwaruddin Mohammad ⁴, Ani Manichaikul ⁵ and Weibin Shi ¹,*

¹ Department of Radiology and Medical Imaging, University of Virginia, Charlottesville, VA 22903, USA

² Department of Biosciences, COMSATS University, Islamabad 45550, Pakistan

³ Department of Zoology, Faisalabad Campus, University of Education, Lahore 38000, Pakistan

⁴ Bioinformatics Core, University of Virginia School of Medicine, Charlottesville, VA 22903, USA

⁵ Department of Genome Sciences, University of Virginia, Charlottesville, VA 22903, USA

* Author to whom correspondence should be addressed.

Int. J. Mol. Sci. 2026, 27(9), 4132; https://doi.org/10.3390/ijms27094132

Submission received: 4 March 2026 / Revised: 29 April 2026 / Accepted: 29 April 2026 / Published: 5 May 2026

(This article belongs to the Collection Feature Papers in Molecular Pathology, Diagnostics, and Therapeutics)


Abstract

Host genetics influence an individual’s susceptibility to both COVID-19 and coronary artery disease (CAD). We analyzed large-scale GWAS datasets encompassing 7.7 million SNPs to identify the shared genetic architecture between the two diseases. We identified 24 coincident risk loci shared by COVID-19 and CAD, with three loci (1p31.1, 8p21.3, and 18q11.2) showing strong evidence for a single shared causal variant. In a Mendelian randomization analysis (GSMR), the 8p21.3 and 18q11.2 loci showed a bidirectional causal association (COVID-19 to CAD and vice versa), whereas the 1p31.1 locus showed only a unidirectional causal association from CAD to COVID-19. A fine mapping analysis of the three loci identified three lead pleiotropic variants (rs7515509, rs8192330, and rs4800403). The variant rs7515509 was spatially associated with AK5, PIGK, USP33, and ZZZ3; rs8192330 with DMTN, PIWIL2, and several other genes; and rs4800403 with GATA6 and CTAGE1. Transcriptomic profiling of peripheral blood mononuclear cells (PBMCs) from COVID-19 patients validated proxitropic variants (rs8192330 and rs4800403) with distinct expression signatures and prioritized DMTN and PIWIL2 as the likely causal genes. Overexpression of DMTN has been linked to the heme metabolism hallmark, disrupted iron distribution in COVID-19 patients with comorbid CAD, and subsequent stress erythropoiesis, oxidative stress, immunological dysfunction, and altered wound healing, while lower expression of PIWIL2 was associated with downregulated cytoplasmic translation and regulation of mRNA metabolism. In conclusion, we identified shared genetic components for COVID-19 and CAD and prioritized DMTN and PIWIL2 as the likely causal genes for the observed shared genetic risk. COVID-19 may act as an acute stressor that unmasks or accelerates underlying CAD.

Keywords:

COVID-19; coronary artery disease; pleiotropy; proxitropy; GWAS; Mendelian randomization; GSMR; post-GWAS; DMTN; PIWIL2

1. Introduction

COVID-19, a pandemic infectious disease caused by the SARS-CoV-2 virus, had infected hundreds of millions of people and resulted in over five million deaths as of 31 January 2022 [1]. It may progress to long COVID-19, with symptoms persisting for weeks, months, or even years after the initial acute phase of illness. Long COVID-19 is an often-debilitating illness that occurs in at least 10% of cases, or approximately 65 million individuals worldwide as of 2023 [2]. Beyond the respiratory symptoms, more than 200 symptoms have been identified, spanning multiple organ systems [2], specifically cardiovascular, neurological, pulmonary, and psychological ones [3]. Conversely, pre-existing obesity, heart failure, and ischemic heart disease are significant risk factors for increased susceptibility to and severity of coronavirus infection and for the development of long COVID-19 [4]. Epidemiological data have revealed that COVID-19 infection is associated with a markedly increased rate of major adverse cardiovascular and thrombotic events within 2 years of infection [5,6,7,8,9,10]. The primary types of complications include myocarditis, acute coronary syndrome (ACS), hypotension, heart failure, shock, and sepsis [11]. Likely reasons for the increased risk include endothelial dysfunction [12,13], cytokine storms [14,15,16], leukocyte–platelet crosstalk [17,18,19], dyslipidemia [20,21], hyperglycemia [22,23], and oxidative stress [24].

Genome-wide association studies (GWAS) and subsequent meta-analyses have identified over 50 loci associated with the susceptibility to and severity of COVID-19 [25,26], including loci on chromosome 3p21 and the ABO locus on chromosome 9q34, which are associated with atherosclerotic plaque rupture and myocardial infarction [27,28]. Thus, it is plausible that the increased risk of major cardiovascular events in COVID-19 patients stems in part from a shared genetic architecture. Specifically, the genetic loci LZTFL1, ABO, ILRUN, and CACFD1 may simultaneously influence both COVID-19 severity and coronary artery disease (CAD) risk [29]. Subsequent studies have verified the significance of the ABO locus in the genetic interaction between COVID-19 and CAD [9,30]. However, the precise mechanisms and causal pathways modulating this interplay remain poorly understood. Here, we investigated the commonalities in the genetic architecture of COVID-19 and CAD to identify novel pleiotropic loci. By integrating cross-trait meta-analysis, generalized summary data-based Mendelian randomization (GSMR), and gene-based association testing through mBAT-Combo, we prioritized candidate genes. These were further validated using bulk RNA-seq-based expression profiling in COVID-19 patients and gene set enrichment analysis to elucidate the shared molecular pathology underlying both conditions. A schematic overview of this study is presented in Figure S1.

2. Results

2.1. Global Genetic Correlation Between COVID-19 and CAD

We used linkage disequilibrium (LD) score regression to assess bivariate genetic correlations between CAD and three COVID-19 clinical phenotypes: (1) critically ill cases, (2) moderate to severe hospitalized cases, and (3) general SARS-CoV-2 reported cases. As shown in Table 1, significant genetic correlations were found between critically ill COVID-19 patients and CAD in both the European (EUR) population (R^G = 0.1028; P_R^G < 2.03 × 10⁻⁷) and the South Asian (SAS) population, which includes the Lahore Punjabi in Pakistan (R^G = 0.0938; P_R^G < 5.70 × 10⁻⁵). Moderate to severe hospitalized COVID-19 patients showed the strongest correlations (EUR: R^G = 0.1516; P_R^G < 2.67 × 10⁻¹⁷; SAS: R^G = 0.1576; P_R^G < 1.07 × 10⁻¹¹). For general SARS-CoV-2 reported cases, correlations were also significant (EUR: R^G = 0.1219; P_R^G < 4.042 × 10⁻¹¹; SAS: R^G = 0.1344; P_R^G < 3.68 × 10⁻⁸). Among the three COVID-19 categories, the moderate to severe hospitalized phenotype showed the strongest genetic association with CAD in both ancestries. These findings suggest that, of the three phenotypes, COVID-19 requiring hospitalization shares the most genetic liability with CAD.

2.2. Coincident Loci for COVID-19 and CAD

We analyzed the 7,685,407 GWAS-curated SNPs common to both GWAS datasets to identify coincident genomic risk loci for COVID-19 and CAD using colocalization analysis. Each locus containing lead SNPs and secondary signals within a 500 kb window was assessed using the Approximate Bayes Factors (ABF) method [31]. We identified 24 coincident risk loci significantly associated with both diseases, defined by a combined posterior probability (PPH₃ + PPH₄) of ≥ 0.7 (Table S1). Of these, 21 loci were associated with distinct causal variants for each disease, whereas only 3 loci showed evidence for a single shared pleiotropic variant (Table S1). Among the 21 coincident regions, loci harboring SLC6A20, LZTFL1 (3p21.31), IRF1, IL5 (5q31.1), ABO (9q34.2), and IFNAR2 (21q22.11) were predominantly associated with COVID-19, while loci containing CFDP1 (16q23.1), COL1A1 (17q21.33), LDLR (19p13.2), RSPH6A, and APOE (19q13.32) were primarily linked to CAD (Table S1).
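The hypothesis-level posteriors behind the PPH₃ + PPH₄ threshold can be illustrated with a minimal Python sketch of the ABF colocalization calculation. It is a sketch under stated assumptions, not the pipeline used in this study: the function names are hypothetical, the prior values (p1 = p2 = 10⁻⁴, p12 = 10⁻⁵) and the prior standard deviation of 0.2 are commonly used defaults assumed here for illustration, and any log-ABF inputs would be toy numbers.

```python
import math

def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_diff(a, b):
    """Stable log(exp(a) - exp(b)); requires a > b (true when a locus has >= 2 SNPs)."""
    return a + math.log1p(-math.exp(b - a))

def wakefield_labf(beta, se, prior_sd=0.2):
    """Wakefield log approximate Bayes factor for one SNP (prior_sd is an assumption)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Return posteriors [PPH0, PPH1, PPH2, PPH3, PPH4] for one locus.

    labf1, labf2: per-SNP log ABFs for trait 1 and trait 2, same SNP order.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,                                                    # H0: no association
        math.log(p1) + s1,                                      # H1: trait 1 only
        math.log(p2) + s2,                                      # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),   # H3: distinct variants
        math.log(p12) + s12,                                    # H4: one shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]
```

In this framing, a locus counts as coincident when the last two entries sum to at least 0.7, and as pleiotropic when the mass sits mostly on the final entry (PPH₄).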

Subsequent HyPrColoc analysis provided strong statistical evidence for pleiotropy at 1p31.1, 8p21.3, and 18q11.2 loci, with posterior probabilities of ≥ 0.75 (Table S2). These loci are located at chromosomes 1:77,695,983–78,192,445 (1p31.1; PP_(HyPrColoc) = 0.85, PIGK) with lead signal rs7515509 (p < 2.94 × 10⁻¹²), 8:21,773,384–22,270,797 (8p21.3; PP_(HyPrColoc) = 0.90, DMTN and PIWIL2) with lead signal rs8192330 (p < 2.83 × 10⁻⁷), and 18:19,748,905–20,248,798 (18q11.2; PP_(HyPrColoc) = 0.97, GATA6 and CTAGE1) with lead signal rs4800403 (p < 1.27 × 10⁻⁷).

LocusCompare plots, which visualize the effect sizes of genetic variants from two GWAS datasets for a specific genomic region, confirmed colocalizations of lead variants for CAD and COVID-19 at these loci in both European and South Asian populations, including the Lahore Punjabi cohort in Pakistan (Figure 1A–C; Figure S2A–C). These results are summarized in a Venn diagram (Figure 1D; Table S3). The genetic variants at the three loci constituted a 99% credible set (Table S9).
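As an illustration of how a 99% credible set can be assembled, the following Python sketch ranks SNPs by posterior probability and accumulates them until the requested coverage is reached. It assumes a single causal variant per locus and per-SNP log ABFs as input, which is a simplification relative to the SuSiE-based fine mapping described in this study; the function name and inputs are hypothetical.

```python
import math

def credible_set(log_abfs, coverage=0.99):
    """Return (indices, pips): the smallest set of SNP indices whose
    cumulative posterior probability reaches `coverage`, plus all posteriors.

    Assumes exactly one causal variant in the region (a simplification).
    """
    m = max(log_abfs)
    weights = [math.exp(x - m) for x in log_abfs]   # rescaled to avoid overflow
    total = sum(weights)
    pips = [w / total for w in weights]             # posterior per SNP
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        chosen.append(i)
        mass += pips[i]
        if mass >= coverage:
            break
    return chosen, pips
```

A small credible set indicates a well-resolved signal, whereas a large one usually reflects extensive LD around the lead variant.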

2.3. Local Genetic Correlations Between COVID-19 and CAD

Given the significant genetic correlation between the moderate to severe hospitalized COVID-19 phenotype and CAD, we conducted local genetic correlation analyses at coincident pleiotropic loci. At the 1p31.1 locus, a significant correlation was observed in the EUR population (R^G = 0.6581, P_R^G < 2 × 10⁻⁴), though no significant correlation was detected in the SAS population. At the 8p21.3 locus, strong inverse correlations were found for both EUR (R^G = −0.8112; P_R^G < 5.45 × 10⁻⁶) and SAS ancestries (R^G = −0.8788; P_R^G < 6.92 × 10⁻⁸). Finally, at the 18q11.2 locus, positive correlations between COVID-19 and CAD were maintained in both groups (R^G = 0.4767; P_R^G < 7.87 × 10⁻⁵ for EUR population and R^G = 0.9226; P_R^G < 7.92 × 10⁻⁶ for the SAS ancestry) (Table 1).

2.4. Coincident Pleiotropic Signals Between COVID-19 and CAD

In the EUR population, fine mapping prioritized rs2133204 at the 1p31.1 locus as the primary signal for CAD. This signal overlapped with the lead signal rs7515509 for COVID-19 in the critically ill and moderate to severe hospitalized patient groups (PPH_4(abf) = 0.70 and 0.94, respectively; Table 2; Figure 2A,B). Strong pairwise linkage disequilibrium (D′ = 0.97, p < 1 × 10⁻⁴) confirmed that these overlapping variants are genetically linked (Figure S4A; Table S4).

At the 8p21.3 locus, the signal rs8192327 for the critically ill group colocalized with the CAD signal rs56390102 (PPH_4(abf) = 0.70; D′ = 0.64, p < 1 × 10⁻⁴), and the lead signal rs8192330 for the moderate to severe hospitalized COVID-19 group colocalized with the CAD signal rs56408342 (PPH_4(abf) = 0.85; D′ = 0.98, p < 1 × 10⁻⁴; Table 2; Figure 2C,D; Figure S4B, Table S4). Notably, no significant colocalization was observed for the general SARS-CoV-2 group at either the 1p31.1 or the 8p21.3 locus.

At 18q11.2, we observed consistent evidence of colocalization across all COVID-19 phenotypes. The lead variant rs4800403 from the critically ill COVID-19 group colocalized with the CAD signal rs16967171 (PPH_4(abf) = 0.98; D′ = 0.98, p < 1 × 10⁻⁴; Table 2; Figure 2E; Figure S4C, Table S4), while the same variant rs4800403 in the moderate to severe hospitalized COVID-19 group overlapped with the CAD variant rs3813126 (PPH_4(abf) = 0.93; D′ = 0.98, p < 1 × 10⁻⁴; Table 2; Figure 2F; Figure S4C, Table S4). Furthermore, the general SARS-CoV-2 group showed strong evidence of colocalization between rs16967171 and CAD (D′ = 1.0, p < 1 × 10⁻⁴; PPH_4(abf) = 0.99) (Table 2; Figure 2G; Figure S4C, Table S4).

Pleiotropic signals were fine mapped through the Coloc Bayesian framework, integrated with the Sum of Single Effects (SuSiE) regression model at the colocalized loci (1p31.1, 8p21.3, and 18q11.2). LD matrices were constructed from the 1000 Genomes Project (1KGP) for the EUR and SAS populations. The analysis prioritized causal signals for both traits across the three COVID-19 phenotype groups. Colocalization evidence was quantified using posterior probabilities (PP) derived from Approximate Bayes Factors (ABF), as described in the methods below.

In the SAS ancestry, including the Lahore Punjabi population in Pakistan, we identified a shared genetic signal rs56390102 from the critically ill COVID-19 group and CAD at the 8p21.3 locus, with strong evidence of colocalization (PPH_4(abf) = 0.74; D′ = 1.0, p < 1 × 10⁻⁴; Table 2; Figure 2H; Figure S4D; Table S4). However, no significant colocalization was observed at this locus for either the moderate to severe hospitalized or the general SARS-CoV-2 groups.

At the 18q11.2 locus, the lead signal rs4800403 from both the critically ill and the moderate to severe hospitalized phenotypes overlapped with the CAD signals rs16967171 and rs12958355, respectively, with high confidence (PPH_4(abf) = 0.98 and 0.74, respectively; D′ = 0.92, p < 1 × 10⁻⁴; Table 2; Figure 2I,J; Figure S4E and Table S4). In the general SARS-CoV-2 group, rs16967171 colocalized with the CAD signal rs12958355 (PPH_4(abf) = 0.99; D′ = 0.92, p < 1 × 10⁻⁴; Table 2; Figure 2K; Figure S4E and Table S4). No evidence of colocalization with CAD was detected at the 1p31.1 locus for the moderate to severe hospitalized group or at the 8p21.3 locus for the general SARS-CoV-2 group.

In the European samples, the lead signal rs7515509 showed strong linkage disequilibrium with neighboring associated signals at the 1p31.1 locus (Figure S4A; Table S4). At the 8p21.3 locus, the lead signal rs8192330 showed high genetic linkage with all identified signals from both the COVID-19 and CAD cohorts (Figure S4B and Table S4). At the 18q11.2 locus, the lead signal showed significant linkage disequilibrium with proxy (secondary) signals associated with either COVID-19 or CAD (Figure S4C and Table S4). For the SAS population, a similar LD pattern was observed for all prioritized signals for both COVID-19 and CAD (Figure S4D and Table S4).
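For readers less familiar with the D′ statistic reported above, the following Python sketch computes D′ and r² for a pair of biallelic variants from haplotype and allele frequencies. The function name and the input frequencies are hypothetical; in practice these quantities would be estimated from a phased reference panel such as the 1KGP data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D_prime, r2) for two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b                     # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

Because D′ is normalized by its maximum attainable value, two variants with different allele frequencies can show D′ close to 1 while r² remains modest, which is worth keeping in mind when reading D′ values alongside colocalization posteriors.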

2.5. Causal Association Between COVID-19 and CAD

2.5.1. Forward Mendelian Randomization (MR) Analysis

Forward GSMR analysis with curated GWAS datasets revealed a statistically significant causal effect of CAD on COVID-19 risk across three genomic regions (Figure 3; Table 3). Significant causal estimates were observed at 1p31.1 (β̂_xy = 4.92, P_{Adj(forward-GSMR)} < 4.21 × 10⁻³), 8p21.3 (β̂_xy = −0.61, P_{Adj(forward-GSMR)} < 3.19 × 10⁻²), and 18q11.2 (β̂_xy = 2.07, P_{Adj(forward-GSMR)} < 2.30 × 10⁻⁶). HEIDI (Heterogeneity in Dependent Instruments) outliers, that is, genetic variants that violate a core assumption of MR, were identified during the GSMR analysis (rs1909203, rs6995980, rs116825679, rs73225858, rs177990, rs579332, rs678308, and rs542228) and excluded so that the remaining instrumental variables would provide a less biased and more reliable estimate of the direct causal relationship between CAD and COVID-19.
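The logic of a summary-data MR estimate can be sketched in a few lines of Python. The example below computes a per-instrument ratio estimate (SNP effect on the outcome divided by SNP effect on the exposure) with a delta-method variance, and then combines instruments by inverse-variance weighting. It assumes independent instruments and is therefore a simplification: GSMR itself uses generalized least squares to account for LD among instruments and applies HEIDI-outlier filtering. Function names and inputs are hypothetical.

```python
import math

def wald_ratio(b_zx, se_zx, b_zy, se_zy):
    """Ratio estimate of the exposure-to-outcome effect from one instrument.

    b_zx, se_zx: SNP effect on the exposure and its standard error
    b_zy, se_zy: SNP effect on the outcome and its standard error
    """
    b_xy = b_zy / b_zx
    # Delta-method variance of the ratio
    var = se_zy ** 2 / b_zx ** 2 + (b_zy ** 2) * (se_zx ** 2) / b_zx ** 4
    return b_xy, var

def ivw_estimate(instruments):
    """Inverse-variance-weighted combination over independent instruments.

    instruments: iterable of (b_zx, se_zx, b_zy, se_zy) tuples.
    Returns (estimate, standard error, two-sided normal p-value).
    """
    ratios = [wald_ratio(*inst) for inst in instruments]
    weight_sum = sum(1.0 / var for _, var in ratios)
    estimate = sum(b / var for b, var in ratios) / weight_sum
    se = math.sqrt(1.0 / weight_sum)
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2.0))
    return estimate, se, p_value
```

Swapping the roles of exposure and outcome in the inputs gives the corresponding reverse-direction estimate.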

2.5.2. Reverse MR Analysis

Reverse GSMR analysis revealed significant causal effects of COVID-19 on CAD risk at two genetic loci: the 8p21.3 locus (β̂_yx = 0.10, P_{Adj(reverse-GSMR)} < 2.20 × 10⁻⁴) and the 18q11.2 locus (β̂_yx = 0.15, P_{Adj(reverse-GSMR)} < 1.72 × 10⁻⁷) (Figure 3 and Table 3). In contrast, the 1p31.1 locus (Chr1:77695983–78192445) showed no significant association (β̂_yx = 2.4 × 10⁻³, P_{Adj(reverse-GSMR)} < 0.83) (Figure 3; Table 3). The HEIDI outliers identified during the reverse GSMR analysis, including rs116825679, rs56408342, rs73225858, rs177990, rs579332, rs678308, and rs542228, were excluded from the analysis.

2.6. Likely Causal Genes at the Coincident Loci Shared by COVID-19 and CAD

We applied the mBAT-Combo framework using GENCODE (v.40) for the hg38 reference genome to prioritize candidate genes driving the shared molecular pathology of COVID-19 and CAD across the curated GWAS datasets. At the 1p31.1 locus, we identified four genes (AK5, PIGK, USP33, and ZZZ3) spatially associated with the lead signal rs7515509 (p < 2.94 × 10⁻¹²; P_mBAT-Combo < 9.08 × 10⁻⁶) as likely candidate genes (Figure 4, Table 4). At the 8p21.3 locus, we found eighteen genes (DMTN, FHIP2B, DOK2, XPO7, NPM2, FGF17, NUDT18, HR, HRURF, REEP4, LGI3, SFTPC, BMP1, PHYHIP, POLR3D, PIWIL2, SLC39A14, and PPP3CC) spatially associated with the lead signal rs8192330 (p < 2.83 × 10⁻⁷; P_mBAT-Combo < 6.46 × 10⁻⁶). At the 18q11.2 locus, we found CTAGE1 and GATA6 as candidate genes spatially associated (p < 1.27 × 10⁻⁷; P_mBAT-Combo < 1.43 × 10⁻⁶) with the pleiotropic signal rs4800403 (Figure 4, Table 4). LocusZoom plots simultaneously visualized the genomic locations, statistical significance, and LD patterns of the SNPs together with the positions of nearby genes (Figure 4, Table 4).
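mBAT-Combo is generally described as combining the mBAT and fastBAT gene-based p-values with a Cauchy combination test. Assuming that description, the combination step can be sketched in Python as follows; the function name is hypothetical, equal weights are an assumption, and the sketch covers only the combination of two already computed p-values, not the gene-based tests themselves.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Each p-value is mapped to a standard Cauchy variate, the variates are
    averaged with the given weights (equal by default), and the weighted
    average is mapped back to a p-value. Inputs must lie strictly in (0, 1).
    """
    n = len(p_values)
    if weights is None:
        weights = [1.0 / n] * n
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    return 0.5 - math.atan(t) / math.pi
```

A useful property of this combination is that a single very small p-value dominates the result, so a gene supported strongly by either test remains significant after combination.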

2.7. Expression Profiling of COVID-19 Cases and Candidate Genes Prioritization

Following normalization, principal component analysis (PCA) confirmed effective sample mitigation; consequently, 36 COVID-19 and 37 healthy samples, comprising 60,603 observations (genes), were retained for further analysis (Figure 5A). No batch correction was applied, and no obvious batch-related structure was observed in PCA. DESeq2 identified 3728 differentially expressed genes in PBMCs, with log2 fold-change values ranging from −6.72 to 11.34 (P_(Adj)-DESeq2 < 0.05; log2FC ≥ 1 for upregulated and log2FC ≤ −1 for downregulated genes) (Table S5). Of the significant genes, 1963 were upregulated and 1765 were downregulated in COVID-19 patients (Table S5). Cross-matching the differentially expressed genes (Table S5) with our prioritized genes (Figure 4, Table 4) validated DMTN and PIWIL2 as the top candidate genes (Table 5, Table S5). DMTN (P_(Adj)-DESeq2 < 1.60 × 10⁻²⁵, log₂FC = 2.42) was upregulated 5.35-fold, and PIWIL2 (P_(Adj)-DESeq2 < 4.21 × 10⁻², log₂FC = −2.18) was downregulated 4.53-fold in COVID-19 patients (Figure 5B, Table 5, Table S5).
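The relationship between the reported log₂ fold changes and the fold-change magnitudes, together with the significance thresholds stated above, can be checked with a short Python sketch. The function names are hypothetical; the cut-offs (adjusted p < 0.05, |log₂FC| ≥ 1) are those given in the text.

```python
def fold_change(log2fc):
    """Linear fold-change magnitude corresponding to a log2 fold change."""
    return 2.0 ** abs(log2fc)

def classify(log2fc, padj, lfc_cut=1.0, alpha=0.05):
    """Label a gene as 'up', 'down', or 'not significant'."""
    if padj >= alpha:
        return "not significant"
    if log2fc >= lfc_cut:
        return "up"
    if log2fc <= -lfc_cut:
        return "down"
    return "not significant"

# Values reported for the two prioritized genes
dmtn_fc = fold_change(2.42)      # about 5.35-fold up
piwil2_fc = fold_change(-2.18)   # about 4.53-fold down
```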

Expression profiling further validated proxitropy, a phenomenon in which a single variant is associated with the expression of multiple neighboring genes. We observed proxitropic associations of the pleiotropic variant rs8192330 (p < 2.83 × 10⁻⁷) with the expression of DMTN (P_(Adj)-DESeq2 < 1.60 × 10⁻²⁵) and PIWIL2 (P_(Adj)-DESeq2 < 4.21 × 10⁻²) at the 8p21.3 locus (Figure 5B; Table 5), and of the variant rs7515509 (p < 2.94 × 10⁻¹²) with the expression of PIGK (P_(Adj)-DESeq2 < 2.23 × 10⁻²) at 1p31.1 (Table S5).

2.7.1. Hierarchical Clustering

Clear transcriptomic separation between healthy and COVID-19 samples was evident only when columns were clustered with the binary distance (clustering_distance_cols = “binary”). The dendrogram (Figure 5C) shows how genes are hierarchically clustered based on the similarity of their expression patterns across all samples. Genes that are closer together on the tree branches have more similar expression levels between the two clinical groups. The heatmap shows that DMTN is upregulated and PIWIL2 is downregulated in COVID-19 samples compared with healthy samples.

2.7.2. Gene Set Enrichment Analysis and Pathways Altered by COVID-19 Infection

Gene set enrichment analysis (GSEA) of PBMC transcriptomic data showed significantly altered pathways associated with DMTN (virtually all upregulated) and PIWIL2 (virtually all downregulated) in COVID-19 patients compared with healthy controls (Figure 6A–C; Tables S6 and S8). DMTN is involved in several key biological pathways, including heme metabolism, and in multiple gene ontology biological processes (GOBPs). These include Fe²⁺/Ca²⁺ homeostasis, intracellular ion homeostasis, erythrocyte homeostasis, fibroblast migration, wound healing and its regulation, and regulation of cytoskeleton organization (Figure 6A,B; Tables S6 and S7).

In the upregulated heme metabolism Hallmark pathway, the gene signatures of the glutathione pathway (SLC7A11, GCLM, NCOA4, and EPOR) (Table S6) and of heme biosynthesis (ALAS2, FECH, and SLC4A) (Table S6) were upregulated. Among the enriched GOBPs, Fe²⁺ homeostasis in particular featured prominently, with upregulation of key iron-regulating genes (SLC40A1, FTH1) (Table S7). FTH1 encodes H-ferritin, a ubiquitous protein for intracellular iron storage and distribution and an important inflammatory marker [32]. Other upregulated GOBP pathways associated with DMTN included fibroblast migration, wound healing, and wound healing regulation (Figure 6A,C; Tables S6 and S7). The details of all DMTN-implicated pathways are shown in Table S7. Similarly, GSEA indicated that PIWIL2 is associated with significantly downregulated biological pathways: cytoplasmic translation and regulation of mRNA metabolism (Figure 6C).
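The enrichment score that underlies a GSEA result can be illustrated with a minimal Python sketch of the weighted running-sum statistic. It assumes a gene list already sorted by a ranking metric (for example, a DESeq2 statistic), with at least one gene inside and one outside the set; the function name and the toy gene labels in any usage are hypothetical, and significance testing by permutation is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, power=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers sorted from most up- to most downregulated
    scores: ranking metric for each gene, in the same order
    gene_set: set of gene identifiers to test
    Returns the signed maximum deviation of the running sum from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** power for s, h in zip(scores, hits) if h)
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up (weighted by the metric) on a hit, step down uniformly on a miss
        running += (abs(s) ** power / hit_norm) if h else (-1.0 / n_miss)
        if abs(running) > abs(best):
            best = running
    return best
```

A positive score indicates that the set is concentrated among upregulated genes, and a negative score indicates concentration among downregulated genes, which matches the direction of the DMTN- and PIWIL2-associated pathways described above.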

3. Discussion

This study was designed to quantify the genetic connection between COVID-19 and CAD and to characterize the shared genetic architecture and associated cellular mechanisms linking the two traits. We provide strong evidence for a shared genetic susceptibility between COVID-19 and coronary artery disease (CAD). We observed significant positive genetic correlations at the genome-wide level and significant local correlations at the locus level. One robust aspect of this study is the consistency of the genetic correlation across COVID-19 severity categories and across European and South Asian (including Lahore Punjabi) populations. Both moderate and severe COVID-19 phenotypes showed an increased genetic susceptibility to CAD, and this was observed in both European (EUR) and South Asian (SAS) ancestries (Table 1). The finding that the Lahore Punjabi population in Pakistan shared these genetic risk markers suggests that the link between COVID-19 and CAD is not population-specific but rather a global biological phenomenon. The observed global genetic correlation between CAD and COVID-19 was low, in the range of approximately 0.10–0.15, but it was statistically significant across all three phenotype groups, which supports the validity of the genetic overlap. The large sample size provided sufficient statistical power to detect even modest local relationships. A recent review found that many patients who recovered from COVID-19 continued to experience complications, including coronary heart disease (CHD), even without detectable viral infection [29]. Clinically, these findings point to individuals at “double risk” of both severe viral outcomes and heart disease.

Although a few studies have demonstrated that COVID-19 significantly increases the risk of CAD and major adverse cardiac events for up to two to three years following the acute infection [4,33,34], this study did not include a longitudinal post-COVID cohort, incident CAD follow-up, adjusted hazard ratios, confidence intervals, or epidemiologic control for confounding and surveillance bias. Systemic inflammation and endothelial dysfunction serve as the shared pathophysiological mechanisms underlying these outcomes [11,12,13]. These results align with previous studies, with minor variations likely due to differences in methodology and datasets [14,35]. In this study, we identified 24 coincident risk loci for COVID-19 and CAD (Table S1). Our results are consistent with those of Guo et al., who used the PLEIO approach for multiple-trait analysis to detect 10 shared loci between COVID-19 and CAD [36,37]. The use of HyPrColoc and fine mapping (SuSiE) allowed us to prioritize specific genetic signals that bridge the two disorders. Many colocalized genomic regions show mere physical “coincidence”, harboring separate causal variants for each disease. This suggests that while these genomic regions are important to both diseases, the functional effects are largely independent (Table S1). This study identified three critical pleiotropic loci at 1p31.1, 8p21.3, and 18q11.2, where specific genetic variants drive risk for both conditions simultaneously (Table 2; Figure 2; Tables S2 and S4 and Figure S4A–D). We found high-confidence evidence (PP_HyPrColoc > 0.75) that single causal variants influence both traits (Table S2), as described previously [38]. These represent the most direct biological links between acute viral severity and chronic cardiovascular disease. The high linkage disequilibrium at the 1p31.1 locus suggests that the variants are inherited together, creating a unified risk block for moderate to severe COVID-19 and CAD (Table 2; Figure 2; Figure S4A–D, Table S4). HyPrColoc has also been applied in the COVID-19 context by Pietzner et al., who used it to identify the genetic architecture of host proteins involved in SARS-CoV-2 infection [39].

The 8p21.3 locus showed a strong negative genetic correlation across both European and South Asian ancestries. This suggests that while the locus is shared, risk alleles at this site may act in opposite directions on the two phenotypes. The 18q11.2 locus exhibited the strongest correlation in the South Asian ancestry (Table 1). The positional candidate gene GATA6 (Table 4) encodes a transcription factor vital for heart and lung repair and is a potential shared driver of both severe pneumonia and coronary damage. Previous studies have demonstrated the involvement of GATA6 in both disorders [40,41].

Our Mendelian randomization analysis provides critical insights into the causal relationship between COVID-19 and CAD. The forward Mendelian randomization analysis indicated that a genetic predisposition to CAD significantly influences the risk of COVID-19 at three specific loci (Figure 3; Table 3). This finding suggests that the physiological state or pathways associated with CAD, such as chronic inflammation or vascular fragility, may create a primed environment for severe viral infection. The reverse Mendelian randomization analysis explored whether the genetic signals for COVID-19 contribute to the development of CAD (Figure 3; Table 3). The 1p31.1 locus showed no significant reverse causality, suggesting that while CAD can influence COVID-19 at this locus, the reverse is not true. These findings provide a genetic explanation for the clinical observation that heart disease and COVID-19 are closely linked. The bidirectional causality at 18q11.2 and 8p21.3 suggests a feedback loop in which cardiovascular vulnerability worsens viral infection, and the resulting viral infection, in turn, accelerates CAD. These results highlight the necessity of long-term cardiovascular monitoring for COVID-19 survivors, particularly those with a genetic predisposition involving these specific loci. Epidemiological and experimental evidence indicates that COVID-19 infection significantly increases the risk of major adverse cardiac and thrombotic events [5,6,7,8,9,10]. Conversely, pre-existing obesity, heart failure, and ischemic heart disease are major risk factors for increased susceptibility to and severity of coronavirus infection and for the development of long COVID-19 syndrome [4].

The integration of the mBAT-Combo framework with transcriptomic data from peripheral blood mononuclear cells (PBMCs) prioritized a list of candidate genes that likely drive the shared pathology of COVID-19 and CAD (Figure 4; Table 4 and Table 5, Table S5). Four genes were identified at the 1p31.1 locus (rs7515509): AK5, PIGK, USP33, and ZZZ3. The association of PIGK is particularly notable as it was validated through expression profiling (Table S5). At the 8p21.3 locus (rs8192330), 18 candidate genes were identified, including DMTN, PIWIL2, FHIP2B, DOK2, and SFTPC (Figure 5; Table 5; Table S5). The high number of genes at this site suggests a complex regulatory environment where multiple biological pathways may be affected. For the 18q11.2 locus (rs4800403), GATA6, which encodes a transcription factor, and CTAGE1 were prioritized (Table 4). Given the role of GATA6 in cardiovascular and pulmonary development, it remains a primary candidate for mediating tissue repair in both diseases. Our spatial findings align with previous studies; for example, variants associated with diabetes and obesity include 86 intronic variants of FTO [42], with the lead SNP (rs9930506) showing spatial associations that affect IRX3 gene regulation. Similarly, SNPs (rs1297265, rs1736020, and rs2823286) in an intergenic region of chromosome 21 are associated with the distant NRIP1 gene rather than the closer USP25 gene [43].

The integrated analysis of transcriptomic profiling and gene set enrichment suggested how these shared variants may drive the systemic damage observed in both severe COVID-19 and CAD. The functional analysis indicates that DMTN and PIWIL2 are not merely located near risk variants (Table 4 and Table 5, Table S5) but are actively dysregulated during COVID-19 infection. The shared genetic architecture between CAD and COVID-19 likely manifests as a dual burden: a predisposition to iron dysregulation and oxidative stress (DMTN) and a compromised pulmonary/vascular defense (PIWIL2). These findings provide a biological roadmap for understanding why some patients are genetically “primed” for both severe viral outcomes and long-term coronary damage. Defects in iron homeostasis, dysregulated erythropoiesis, and immune dysfunction due to COVID-19 possibly contribute to inefficient oxygen transport, inflammatory disequilibrium, and persisting symptomatology, and may be therapeutically tractable [44]. Iron and the associated oxidative stress also play a pathological role in CAD [45,46,47,48]. Previous studies indicated that SARS-CoV-2 severely disrupts mRNA metabolism, including tRNA aminoacylation and miRNA pathways, which facilitates the aberrant immune response seen in critical patients. By manipulating host machinery for replication, the virus induces high-level cytokine production and impairs interferon signaling. This leads to significant metabolic shifts, such as mitochondrial dysfunction in immune cells. Furthermore, hypoaccurate translation within the mitochondria results in increased oxidative stress, which is sufficient to influence protein synthesis. This process of translation remodeling can be mediated by reversible changes in the redox state of protein synthesis components susceptible to oxidation. In the broader context of cardiovascular pathology, such as coronary artery disease (CAD), cytoplasmic translation plays a major role in the cellular response to ischemia, stress, and remodeling [49,50,51,52,53,54,55].

4. Materials and Methods

4.1. Datasets and Study Populations

Meta-analysis GWAS summary statistics for COVID-19 were obtained from the COVID-19 Host Genetics Initiative (HGI) (https://www.covid19hg.org/, accessed on 28 April 2026), release round 7, focusing on three predefined phenotypes: (1) critically ill cases (18,152 subjects who required respiratory support in hospital or who died), (2) moderate to severe cases (44,986 hospitalized subjects), and (3) SARS-CoV-2 reported cases (159,840 cases and over 6 million controls across 64 studies) [25,26,56,57,58]. The first two categories represent the severity of COVID-19, while the third reflects COVID-19 susceptibility [59].

Corresponding GWAS summary datasets for CAD were obtained from the Coronary ARtery DIsease Genome Wide Replication and Meta-analysis plus The Coronary Artery Disease Genetics consortium (CARDIoGRAMplusC4D) (https://www.cardiogramplusc4d.org/, accessed on 28 April 2026), encompassing 22,233 cases and 64,762 controls in CARDIoGRAM and 15,420 CAD cases and 15,062 controls in the C4D GWAS [60]. CAD participants were predominantly of European (EUR) ancestry, while the COVID-19 subjects were of different genetic ancestries, including admixed American, African, East Asian, European, Middle Eastern, and South Asian participants. To prioritize shared causal variants for both traits, we performed fine mapping utilizing LD matrices derived primarily from European ancestry. To improve the resolution of these signals and account for global genetic diversity, we included a South Asian population, specifically the Lahore Punjabi (PJL) population in Pakistan. RNA-seq expression profiling data (GSE202805) of peripheral blood mononuclear cells (PBMCs) from COVID-19 patients were retrieved from the Gene Expression Omnibus repository [61]. These data were generated using the Illumina HiSeq 4000 high-throughput sequencing platform. The cohort includes healthy (n = 10), acute-mild (n = 4), acute-moderate (n = 6), acute-severe (n = 32), and convalescent (n = 28) subjects. We compared “COVID-19” samples (n = 42) with “healthy” samples (n = 38), as described in Section 4.3.

4.2. Post-GWAS Analyses

We conducted post-GWAS analyses for both diseases, as outlined in Figure S1. The COVID-19 dataset included 14,415,897 distinct single-nucleotide polymorphisms (SNPs) across the three predefined phenotypes, and the CAD dataset included 20,853,377 unique SNPs. We cross-referenced these datasets to identify shared biallelic SNPs. Rare variants, defined as those with a reported allele frequency of less than 1% in either GWAS, were excluded. This resulted in a set of 7,685,407 SNPs common to both traits, referred to as the GWAS-curated datasets. For the high-coverage whole-genome sequencing samples from the 1000 Genomes Project (1KGP) resource, we applied standard quality control procedures using PLINK2 [62,63,64], with the following filters: missing rate per variant (--geno 0.2), missing rate per individual (--mind 0.2), minor allele frequency (--maf 0.01), and Hardy–Weinberg equilibrium (--hwe midp). From the quality-controlled 1KGP genotype data, we extracted biallelic variants for the European and South Asian samples, including the Lahore Punjabi population in Pakistan. The resulting dataset, named the 1KGP-curated data, served as the reference panel for constructing LD matrices for the respective populations.
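The cross-referencing step can be illustrated with a minimal sketch. It assumes each GWAS has been loaded into a dictionary keyed by (chromosome, position); the `Snp` record, the function names, and the toy values are hypothetical and are not taken from the study's pipeline.

```python
from typing import Dict, NamedTuple, Tuple

BASES = {"A", "C", "G", "T"}


class Snp(NamedTuple):
    ref: str
    alt: str
    maf: float  # allele frequency reported by the GWAS (assumed field)


def is_biallelic_snv(snp: Snp) -> bool:
    """True when both alleles are single, distinct nucleotides."""
    return snp.ref in BASES and snp.alt in BASES and snp.ref != snp.alt


def shared_common_snps(
    covid: Dict[Tuple[str, int], Snp],
    cad: Dict[Tuple[str, int], Snp],
    min_maf: float = 0.01,
) -> Dict[Tuple[str, int], Tuple[Snp, Snp]]:
    """Keep SNPs present in both datasets, with matching alleles,
    and with allele frequency >= min_maf in both GWAS."""
    shared = {}
    for key in covid.keys() & cad.keys():
        a, b = covid[key], cad[key]
        same_alleles = {a.ref, a.alt} == {b.ref, b.alt}
        if (is_biallelic_snv(a) and is_biallelic_snv(b) and same_alleles
                and a.maf >= min_maf and b.maf >= min_maf):
            shared[key] = (a, b)
    return shared


# Toy input (illustrative values only)
covid = {("1", 100): Snp("A", "G", 0.20), ("1", 200): Snp("C", "T", 0.005)}
cad = {("1", 100): Snp("G", "A", 0.25), ("1", 200): Snp("C", "T", 0.02)}
common = shared_common_snps(covid, cad)
```

In this toy input the second SNP is dropped because its frequency is below 1% in one of the two datasets, mirroring the "in either GWAS" rule described above.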

4.2.1. Evaluation of Global Genetic Connections Between COVID-19 and CAD

We assessed the genome-wide genetic correlation and covariance between the three predefined COVID-19 phenotypes and CAD using the LD score regression model in the R implementation of the ldsc software (version 1.0.1) [65,66]. For the model, we estimated custom LD scores for the SNPs present in both GWAS-curated datasets using LD matrices from the EUR and SAS samples (1KGP-curated data), with a window size of 1 centimorgan (cM). The computed LD scores were then used to estimate the genome-wide genetic associations between CAD and the three COVID-19 phenotypes from HGI.
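The idea behind cross-trait LD score regression can be sketched as follows: the product of the two traits' z-scores at each SNP is regressed on that SNP's LD score, and the slope is rescaled into a genetic covariance. The sketch below is a deliberately simplified, unweighted version; the ldsc software uses weighted regression and block-jackknife standard errors, which are omitted here. Taking the number of SNPs M as the length of the input and all function names are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def genetic_covariance(z1, z2, ld_scores, n1, n2) -> Tuple[float, float]:
    """Model: E[z1*z2] = sqrt(n1*n2) * rho_g / M * ld_score + intercept."""
    m = len(ld_scores)  # assumption: M is the number of SNPs supplied
    products = [a * b for a, b in zip(z1, z2)]
    slope, intercept = ols(ld_scores, products)
    return slope * m / math.sqrt(n1 * n2), intercept


def genetic_correlation(z1, z2, ld_scores, n1, n2) -> float:
    """r_g = rho_g / sqrt(h2_1 * h2_2); each h2 uses the same regression
    with a trait paired against itself (z*z is the chi-square statistic)."""
    cov, _ = genetic_covariance(z1, z2, ld_scores, n1, n2)
    h2_1, _ = genetic_covariance(z1, z1, ld_scores, n1, n1)
    h2_2, _ = genetic_covariance(z2, z2, ld_scores, n2, n2)
    return cov / math.sqrt(h2_1 * h2_2)
```

The locus-level analysis in Section 4.2.4 follows the same logic, restricted to the SNPs of one colocalized locus.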

4.2.2. Functional Genomic Coordinates or Genomic Risk Loci

We defined functional genomic regions, or risk loci, using index SNPs in the GWAS-curated datasets (Section 4.2, Post-GWAS Analyses) with PLINK v.2 (https://www.cog-genomics.org/plink/2.0/, accessed on 28 April 2026) [64,67]. The genome-wide significant SNPs (p < 5 × 10⁻⁸) were clumped at an r² threshold (r² < 0.1) to identify independent lead signals within each risk locus. We assigned a window (±250 kb) centered on each lead signal to serve as the functional block for each locus [68]. Biallelic variants grouped with lead signals at pairwise r² (0.1 ≤ r² < 0.6), or SNPs within the window, were retained at a p-value threshold of ≤ 0.1 to ensure robust results and minimize noise in downstream analyses such as colocalization and fine mapping. The lead signal with the smallest p-value was designated as the top signal, while all others within the locus were considered secondary signals. We used a GRCh38 1KGP reference panel to establish risk loci through pairwise r². When multiple significant signals were present in a genomic interval, we separated them using HaploReg LD data with the haploR v.4.2 package to determine whether the region harbors more than one causal or independent variant [69,70].
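The clumping and window logic described here can be sketched as a greedy procedure: rank genome-wide significant SNPs by p-value and keep a SNP as a lead signal only if it is in low LD with every lead already chosen. This is a simplified illustration, not PLINK's implementation (which also applies a physical-distance constraint); the `r2` lookup function, the SNP identifiers, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def clump(
    snps: Dict[str, Tuple[int, float]],   # id -> (position, p-value)
    r2: Callable[[str, str], float],      # pairwise LD lookup (assumed)
    p_threshold: float = 5e-8,
    r2_threshold: float = 0.1,
) -> List[str]:
    """Return independent lead SNPs, most significant first."""
    candidates = sorted(
        (s for s in snps if snps[s][1] < p_threshold),
        key=lambda s: snps[s][1],
    )
    leads: List[str] = []
    for snp in candidates:
        if all(r2(snp, lead) < r2_threshold for lead in leads):
            leads.append(snp)
    return leads


def locus_window(position: int, flank: int = 250_000) -> Tuple[int, int]:
    """Window of +/- flank base pairs around a lead SNP, floored at 0."""
    return max(0, position - flank), position + flank
```

The first element of the returned list corresponds to the top signal of a locus; the remaining leads are its secondary signals.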

4.2.3. Trait–Trait Colocalization

Finding shared genetic links (overlap and pleiotropy) between COVID-19 and CAD was achieved through the horizontal integration of summary statistics from two separate GWAS. We used COVID-19 and CAD GWAS-curated datasets for colocalization and employed three models for defining genetic colocalization, namely, Approximate Bayes Factor (ABF) from Coloc v.5.2.3 [31], Hypothesis Prioritization for multi-trait Colocalization from HyPrColoc v.0.0.2 [71], and comparison of locus plots from LocusCompareR v.1.0.0 [72].

ABF identifies either (1) regions with a pleiotropic effect on both traits through a single shared causal variant or (2) regions shared by both traits but harboring distinct causal variants, based on the combined posterior probability [PP(H₃ + H₄)] with a high degree of confidence (PP > 0.7). This indicates that COVID-19 and CAD share genomic regions, either with distinct causal variants [H₃] or with a common causal signal [H₄], with high confidence. Specifically, H₃ indicates that both traits are associated with distinct causal variants within the same region, whereas H₄ suggests a single shared causal variant. We analyzed the genetic signals in each risk locus from the curated GWAS data with the default prior parameters P₁ = P₂ = 1 × 10⁻⁴ for the prior probability that a SNP is causally associated with trait 1 or trait 2. The prior probability, P₁₂, that a SNP is associated with both traits was set to 5 × … [31,73]. To identify the colocalized loci containing a single causative SNP with a pleiotropic effect on both traits, we used the R implementation of HyPrColoc v.0.0.2 [71], a Bayesian divisive clustering algorithm, with the GWAS-curated datasets as input. The loci were considered pleiotropic if PP_HyPrColoc ≥ 0.75, as indicated by Koskeridis et al. [38]. Genetic loci that met the specified colocalization criteria [PP_HyPrColoc ≥ 0.75 and PP(H₃ + H₄) ≥ 0.7] were further visually inspected using LocusCompare plots. A Venn diagram was used to summarize the results of the different colocalization methods. The shared loci identified by colocalization have a 99% probability of containing the causal signals (individual or genetically linked), and we treated them as the 99% credible set of consensus loci for overlapping causal signals. The 99% credible sets were derived from coloc.abf for colocalized loci at a predefined threshold of PP(H₄) > 0.7.
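For readers unfamiliar with the approximate Bayes factor approach, the following sketch shows how per-SNP Wakefield log-ABFs are combined into posterior probabilities for hypotheses H₀ to H₄ under the single-causal-variant assumption. It follows the published form of the method but is not the Coloc package code. The prior effect standard deviation of 0.2 and the `p12` value used in the example are illustrative assumptions, not values confirmed by the text above.

```python
import math
from typing import List


def log_sum(xs: List[float]) -> float:
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_diff(a: float, b: float) -> float:
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return -math.inf
    return a + math.log1p(-math.exp(b - a))


def log_abf(beta: float, var_beta: float, prior_sd: float = 0.2) -> float:
    """Wakefield log approximate Bayes factor for one SNP.
    prior_sd = 0.2 is an assumed prior for a case-control trait."""
    z2 = beta * beta / var_beta
    r = prior_sd ** 2 / (prior_sd ** 2 + var_beta)
    return 0.5 * (math.log(1.0 - r) + r * z2)


def coloc_posteriors(labf1, labf2, p1: float, p2: float, p12: float) -> List[float]:
    """Posterior probabilities [H0, H1, H2, H3, H4] for one locus."""
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,                                                    # H0: no association
        math.log(p1) + s1,                                      # H1: trait 1 only
        math.log(p2) + s2,                                      # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),   # H3: distinct variants
        math.log(p12) + s12,                                    # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy locus with three SNPs; both traits peak at the first SNP.
se2 = 0.02 ** 2
trait1 = [log_abf(b, se2) for b in (0.10, 0.0, 0.0)]
trait2 = [log_abf(b, se2) for b in (0.10, 0.0, 0.0)]
pp = coloc_posteriors(trait1, trait2, p1=1e-4, p2=1e-4, p12=5e-6)  # p12 is illustrative
```

With both traits peaking at the same SNP, most of the posterior mass falls on H₄; moving one trait's peak to a different SNP shifts mass toward H₃.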

4.2.4. Evaluation of Local Genetic Connections Between COVID-19 and CAD

To assess the genetic correlation and covariance of the genomic regions, or loci, identified in Section 4.2.3 (Trait–Trait Colocalization), the COVID-19 phenotype showing the highest correlation was selected for analysis alongside CAD. LD score regression, as described in Section 4.2.1, was performed for the colocalized loci. We estimated LD scores for each colocalized locus using LD matrices from the EUR and SAS samples, with a window size of 1 centimorgan (cM), and applied them in the model to estimate the locus-level genetic correlation between COVID-19 and CAD.

4.2.5. Fine Mapping and Prioritizing Pleiotropic Variants

Fine mapping aims to detect joint-association signals for causal inference, where the strength of joint association is assessed using the posterior inclusion probability (PIP). We performed fine mapping using the Coloc R package, which incorporates the Sum of Single Effects (SuSiE) model [74,75] to pinpoint potential causal variants more accurately [76,77]. The colocalized loci from trait–trait colocalization were analyzed for shared causal signals using the meta-analysis summary data for CAD and the three COVID-19 phenotypes. The LD matrices required for the analysis were constructed from the European and South Asian 1KGP-curated data using PLINK2 [64,67]. The posterior probability PP(H₄) from the SuSiE model in Coloc served as the posterior probability for colocalization, expressed at the SNP level as the posterior inclusion probability, which indicates whether a specific overlapping signal is the causal one in each of the three loci; we required a high degree of confidence (PP(H₄) ≥ 0.7). Where fine mapping identified proxy variants rather than the lead SNPs themselves, we considered the lead SNPs from each locus (as defined in Section 4.2.2, Functional Genomic Coordinates or Genomic Risk Loci) to be pleiotropic for both traits, based on the linkage disequilibrium (LD) patterns between the colocalized proxy (secondary) variants and the lead variant in both ancestries. The SuSiE model requires an estimate of the number of independent causal signals within a locus as an input parameter. Within each identified genomic region, we fitted the Coloc-based SuSiE summary statistics model with 100 iterations, setting the expected number of pleiotropic signals to match the number of identified loci.
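How per-SNP posterior probabilities translate into a credible set can be shown with a short sketch: SNPs are ranked by probability and added until the cumulative mass reaches the target coverage. This illustrates the general construction only and is not the SuSiE algorithm; the function name and the example identifiers are hypothetical.

```python
from typing import Dict, List


def credible_set(pips: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Smallest set of SNPs whose normalized posterior mass reaches coverage."""
    total = sum(pips.values())
    ranked = sorted(pips.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for snp, pip in ranked:
        chosen.append(snp)
        cumulative += pip / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical SNP identifiers and probabilities
example = credible_set({"snp1": 0.90, "snp2": 0.08, "snp3": 0.015, "snp4": 0.005})
```

A locus where one variant carries nearly all of the posterior mass yields a small credible set, which is the situation fine mapping aims for.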

4.2.6. Bidirectional Mendelian Randomization (MR)

We applied Generalized Summary data-based Mendelian randomization (GSMR) to assess the causal relationship between COVID-19 and CAD using publicly available GWAS summary data [78,79]. Lead pleiotropic SNPs associated with either COVID-19 or CAD (p < 5 × 10⁻⁸), along with independent signals (p < 5 × 10⁻³ and r² < 0.05), served as instrumental variables (IVs) in forward and reverse MR analyses, respectively. Colocalized loci underwent LD pruning (r² < 0.05 between lead and independent signals; p < 5 × 10⁻³) within a 500 kb window using PLINK2, with the 1KGP-curated data as the reference [64,67]. In the bidirectional analysis using GSMR2 v.1.1 [78,79], we applied a GWAS p-value threshold (p < 5 × 10⁻³) to select SNPs, a false discovery rate (FDR) threshold (≤ 0.05) to exclude chance correlations, and a multi-SNP-based Heterogeneity in Dependent Instruments (HEIDI)-outlier threshold (p < 0.01) to remove SNPs that influence both the exposure and the outcome traits through independent pathways in the global HEIDI-outlier method [80]. We also conducted univariate MR analyses for each instrumental variable and plotted the association strength against the causal estimate (β̂) for all exposure–outcome pairs.
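The per-instrument (univariate) estimates and a pooled estimate can be sketched with the Wald ratio and a fixed-effect inverse-variance weighted (IVW) combination. This is a simpler stand-in for illustration: GSMR itself uses generalized least squares that accounts for LD among instruments and applies HEIDI-outlier filtering, neither of which is reproduced here. Function names and inputs are hypothetical.

```python
import math
from typing import List, Tuple


def wald_ratio(beta_exposure: float, beta_outcome: float,
               se_outcome: float) -> Tuple[float, float]:
    """Single-instrument causal estimate and first-order standard error."""
    estimate = beta_outcome / beta_exposure
    return estimate, se_outcome / abs(beta_exposure)


def ivw(beta_exposure: List[float], beta_outcome: List[float],
        se_outcome: List[float]) -> Tuple[float, float]:
    """Fixed-effect IVW estimate, assuming independent instruments."""
    num = sum(x * y / s ** 2
              for x, y, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(x * x / s ** 2 for x, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)
```

Running the same functions with the exposure and outcome swapped gives the reverse-direction estimate used in a bidirectional design.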

4.2.7. Gene-Level Analysis

mBAT-combo is a set-based statistical test designed for gene-based association analysis in post-GWAS research that offers improved power over existing methods [81]. Unlike traditional single-SNP analyses that test one genetic marker at a time, mBAT-combo aggregates the association signals of multiple SNPs within predefined sets, typically defined by gene boundaries. This approach combines the Multivariate set-Based Association Test (mBAT) and Fast set-Based Association Test (fastBAT) statistics using a Cauchy combination method. We used mBAT-combo v.0.0.0.9 [81] to prioritize the genes most likely involved in the shared risk of COVID-19 and CAD for each pleiotropic locus identified through trait–trait colocalization. We used the GWAS-curated datasets for COVID-19 and CAD to prioritize the genes, together with a mixed-ancestry linkage disequilibrium reference panel comprising 3202 samples from the 1KGP. We included protein-coding genes with unique Ensembl gene IDs from GENCODE (v.40) for hg38 (https://www.gencodegenes.org/) to set boundaries. Gene regions were defined as spanning 50 kb upstream and downstream of each gene’s untranslated regions. Genes predicted by mBAT-combo for both traits were visualized using LocusZoom v.0.3.8 [82] and LDlinkR v.1.4.0 [83] to inspect the genetic architecture, including the recombination rates and linkage disequilibrium patterns among pleiotropic variants.
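The Cauchy combination step can be written in a few lines: each p-value is mapped to a standard Cauchy variate, the variates are averaged, and the average is mapped back to a p-value. The sketch below assumes equal weights for the component tests, which is an assumption for illustration rather than a statement about the mBAT-combo implementation.

```python
import math
from typing import List, Optional


def cauchy_combination(p_values: List[float],
                       weights: Optional[List[float]] = None) -> float:
    """Combine p-values with the Cauchy combination test.
    weights must sum to 1; equal weights are used when omitted."""
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical p-values from two set-based tests of the same gene
combined = cauchy_combination([1e-6, 0.03])
```

A useful property is that the combined p-value stays valid under arbitrary dependence between the component tests, which matters here because both tests are computed on the same SNP set.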

4.3. RNA-Seq Analysis and Candidate Gene Prioritization

RNA-seq expression profiles of 80 peripheral blood mononuclear cell (PBMC) samples, generated on the Illumina HiSeq 4000 platform, were retrieved from the Gene Expression Omnibus (GEO accession: GSE202805) [61]. The RNA-seq comparison groups were defined from the sample metadata provided with the GEO dataset. Specifically, we combined all acute COVID-19 categories, namely acute-mild (n = 4), acute-moderate (n = 6), and acute-severe (n = 32), into a single group referred to as “COVID-19” (total n = 42), and combined the healthy (n = 10) and convalescent (n = 28) subjects into a second group referred to as “healthy” (total n = 38), giving an approximately 1:1 ratio between the two groups. Differential expression analysis was performed with the R package DESeq2 v.1.48.2 [84] by comparing COVID-19 (n = 42) with healthy (n = 38). The raw, non-normalized gene count matrices downloaded directly from the GEO dataset were used for the analysis. The differentially expressed genes were then cross-matched with the genes prioritized in the gene-level analysis to validate the candidate genes identified in the preceding gene-level association. A DESeqDataSet object was created from the count matrices, followed by normalization, dispersion estimation, and negative binomial modeling [84]. We specified the reference level in the model and pre-filtered genes, retaining only those with at least five counts in a minimum of five samples; genes with low expression across all samples were excluded. The filtered dataset was then normalized, its dispersions were estimated (a critical step when dealing with relatively small sample sizes), and negative binomial models were fitted as implemented in DESeq2. We applied the variance stabilizing transformation (VST) [85,86,87] with the default blind setting for visualization and other multivariate analyses for which raw count data were inappropriate.
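The pre-filtering rule (at least five counts in a minimum of five samples) is simple to state in code. The sketch below assumes the count matrix is held as a dictionary from gene name to a list of per-sample raw counts; the gene names and the function name are hypothetical, and the study itself performed this step in R.

```python
from typing import Dict, List


def prefilter_genes(counts: Dict[str, List[int]],
                    min_count: int = 5,
                    min_samples: int = 5) -> Dict[str, List[int]]:
    """Keep genes with >= min_count reads in at least min_samples samples."""
    return {
        gene: row
        for gene, row in counts.items()
        if sum(c >= min_count for c in row) >= min_samples
    }


# Toy count matrix with six samples per gene
toy_counts = {
    "GENE_A": [10, 8, 7, 6, 5, 0],   # five samples reach the cutoff: kept
    "GENE_B": [5, 5, 5, 5, 0, 0],    # only four samples reach it: dropped
    "GENE_C": [0, 1, 0, 2, 0, 0],    # low expression throughout: dropped
}
kept = prefilter_genes(toy_counts)
```

Filtering before model fitting reduces the number of tests and therefore the multiple-testing burden, without changing the estimates for the retained genes.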
Log-transformed, normalized expression data for the 500 most variable genes were used for unsupervised principal component analysis (PCA) to identify unintended sources of variation, such as covariates and batch effects. Samples from the two groups (COVID-19 and healthy) that intermixed in the PCA clustering were excluded from subsequent analysis. To minimize noise from low-count genes and improve gene ranking and visualization in an MA plot, we used the apeglm package v.1.32.0 for log2 fold change (log2FC) shrinkage instead of the default DESeq2 “normal” estimator [88]. Genes with log2FC ≥ 0.5 and a Benjamini–Hochberg (FDR) adjusted p < 0.05 were considered significantly differentially expressed between groups. These differentially expressed genes were compared with the likely causal genes identified from the genome-wide association analyses, and the top candidates were visualized using volcano plots (ggplot2 v.3.5.2) [89]. The top candidate genes were carried forward to gene set enrichment analysis (FGSEA) to predict the altered pathways in COVID-19 patients. The top genes and the associated significant DESeq2 signatures from gene set enrichment analysis were displayed by hierarchical clustering of the samples, performed with pheatmap v.1.0.13 using different distance metrics [90].
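The Benjamini–Hochberg adjustment used for the significance threshold can be sketched as follows. Each p-value is scaled by the number of tests divided by its rank, and monotonicity is enforced from the largest p-value downward. The function name is hypothetical, and the example p-values are illustrative only.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of a larger p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

In practice the adjusted p-value is combined with an effect-size cutoff, as in the log2FC and FDR thresholds stated above.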

4.4. Gene Set Enrichment Analysis and Identification of Altered Pathways from COVID-19 Patients

Gene set enrichment analysis (GSEA) was performed with the GSEA function of the clusterProfiler R package (version 4.18.4) [91,92,93,94,95], using the ranked gene list derived from the DESeq2 results, in which genes were ordered by the stat column. We used the default gene set size parameters in GSEA, with a maximum gene set size of 500. For pathway annotations, gene sets were obtained from the Molecular Signatures Database (MSigDB) [96] using the msigdbr R package (version 26.1.0) [97,98]. Multiple testing correction for pathway enrichment was performed using the Benjamini–Hochberg method [99], and adjusted p-values (FDR) were used to determine significance. We identified the altered biological pathways underlying the differentially expressed top candidate genes in COVID-19 patients by referencing three gene sets from MSigDB: hallmark gene sets (H) [97], ontology gene sets (C5), which include gene ontology (GO) [100], and human phenotype ontology (HPO) [101]. The significantly altered pathways from GSEA for the top candidate genes were visualized using clusterProfiler v.4.18.4 [91,93,94,95].
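The statistic at the core of GSEA is a running-sum enrichment score computed over the ranked gene list. The sketch below implements the classic weighted running sum for one gene set. It omits the permutation-based p-value and normalization that clusterProfiler computes, and the gene names, weight of 1, and function name are assumptions for illustration.

```python
from typing import List, Set, Tuple


def enrichment_score(ranked: List[Tuple[str, float]], gene_set: Set[str],
                     weight: float = 1.0) -> float:
    """Running-sum enrichment score.
    ranked: (gene, statistic) pairs sorted by statistic, descending."""
    hit_weights = [abs(stat) ** weight if gene in gene_set else 0.0
                   for gene, stat in ranked]
    hit_total = sum(hit_weights)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if hit_total == 0.0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, hit_weights):
        # Step up at set members (weighted), step down at non-members.
        running += w / hit_total if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked_genes = [("g1", 3.0), ("g2", 2.0), ("g3", 1.0),
                ("g4", -1.0), ("g5", -2.0), ("g6", -3.0)]
es_up = enrichment_score(ranked_genes, {"g1", "g2"})
es_down = enrichment_score(ranked_genes, {"g5", "g6"})
```

A positive score indicates that the set is concentrated among the genes with the highest statistics, and a negative score indicates concentration at the bottom of the list.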

5. Conclusions

In conclusion, we conducted post-GWAS analyses to identify the shared molecular signatures and mechanisms linking both traits. This study identifies a shared genetic susceptibility between COVID-19 and coronary artery disease (CAD) across different populations, revealing a “double risk” driven by shared mechanisms such as inflammation and endothelial dysfunction. The analysis identifies 24 shared risk loci and 3 critical pleiotropic loci (1p31.1, 8p21.3, and 18q11.2) where single variants drive risk for both conditions, involving candidate genes such as GATA6 that mediate cardiovascular and pulmonary damage. The shared genetic architecture between CAD and COVID-19 likely manifests as a dual burden: a predisposition to iron dysregulation and oxidative stress (DMTN) and a compromised pulmonary/vascular defense (PIWIL2). These findings provide a biological roadmap for understanding why some patients are genetically “primed” for both severe viral outcomes and long-term coronary damage.

Supplementary Materials

The following supplementary materials can be downloaded at: https://www.mdpi.com/article/10.3390/ijms27094132/s1.

Author Contributions

M.S.A. conceived and designed the project, conducted the research, performed data analysis, and wrote the manuscript. W.H. and W.S. supervised the research and provided constructive feedback. S.A. participated in data curation, methodology, and manuscript preparation. A.M. (Anwaruddin Mohammad) assisted with data analysis on the High-Performance Computing system in the Bioinformatics Core. A.M. (Ani Manichaikul) helped review the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Higher Education Commission (HEC) of Pakistan, grant No: 1-8/HEC/HRD/2023/13078 (PIN: IRSIP 53 BMS 27), dated 2 June 2023 under the International Research Support Initiative Program (IRSIP).

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Review Board of COMSATS University Islamabad, Pakistan (CUI/Bio/ERB/2023/12; 5 January 2023) for studies involving humans.

Informed Consent Statement

Not applicable.

Data Availability Statement

Summary statistics generated by the COVID-19 HGI are publicly available on the HGI website (https://www.covid19hg.org/results/r7/, accessed on 28 April 2026), including per-ancestry summary statistics for African, admixed American, East Asian, European, and South Asian ancestries. CAD summary statistics are available through the CARDIoGRAMplusC4D website (http://www.cardiogramplusc4d.org/, accessed on 28 April 2026) and the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/, accessed on 28 April 2026), accession code GCST90132314. The PBMC expression profiling data (GSE202805) derived from COVID-19 acute-severe patients are available from GEO (https://www.ncbi.nlm.nih.gov/geo/, accessed on 28 April 2026), and the reference genotype data are available from the 1000 Genomes Project (https://www.genome.gov/27528684/1000-genomes-project, accessed on 28 April 2026). The software used in this study is available as follows: Coloc for summary statistics, colocalization, and fine mapping (https://github.com/chr1swallace/coloc, accessed on 28 April 2026), HyPrColoc (https://github.com/cnfoley/hyprcoloc, accessed on 28 April 2026), PLINK (https://github.com/insilico/plink, accessed on 28 April 2026), Locuscomparer (https://github.com/boxiangliu/locuscomparer, accessed on 28 April 2026), locuszoomr (https://github.com/myles-lewis/locuszoomr, accessed on 28 April 2026), GSMR2 for Mendelian randomization (https://github.com/JianYang-Lab/gsmr2, accessed on 28 April 2026), DESeq2 (https://bioconductor.org/packages/release/bioc/html/DESeq2.html, accessed on 28 April 2026), and annotation tools (https://useast.ensembl.org/info/docs/tools/vep/index.html, accessed on 28 April 2026). These tools are available on GitHub or at the sites listed above, and were used with the R computing language (https://www.r-project.org/, https://github.com/rstudio/rstudio, accessed on 28 April 2026).

Acknowledgments

I am sincerely grateful to Weibin Shi for his supervision, laboratory facilities, and institutional support. This work was conducted in Shi’s laboratory in the Department of Radiology and Medical Imaging, Sheridan G. Snyder Translational Research Building at Fontaine Research Park, and partially used High-Performance Computing (HPC) (https://www.rc.virginia.edu/, accessed on 28 April 2026) in the Bioinformatics Core, which is supported by the University of Virginia, School of Medicine, Research Resource Identifiers (RRID):SCR_012718.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ABF	Approximate Bayes factors
ACS	Acute coronary syndrome
C2	Curated gene sets
C5	Ontology gene sets
CGP	Chemical and genetic
CPs	Canonical pathways
CAD	Coronary artery disease
CGPs	Chemical and genetic perturbations
EUR	European
fastBAT	Fast and flexible set-Based Association Test
FDR	False Discovery Rate
GSEA	Gene Set Enrichment Analysis
GO	Gene ontology
GWAS	Genome-Wide Association Study
GSMR	Generalized Summary data-based Mendelian randomization
H	Hallmark gene set
HEIDI	Heterogeneity in dependent instruments
HGI	Host Genetics Initiative
HPC	High-performance computing
HPO	Human phenotype ontology
HyPrColoc	Hypothesis Prioritisation for multi-trait Colocalization
1KGP	1000 Genomes Project
LD	Linkage disequilibrium
LDL	Low-density lipoprotein
log2FC	Log2 fold change
mBAT	Multivariate set-based association test
MSigDB	Molecular signatures database
PP	Posterior probability
PBMC	Peripheral blood mononuclear cell
SNP	Single Nucleotide Polymorphism
SARS-CoV-2	Severe acute respiratory syndrome coronavirus 2
SAS	South Asian
VST	Variance stabilizing transformation

References

Kim, S.Y.; Yeniova, A.Ö. Global, Regional, and National Incidence and Mortality of COVID-19 in 237 Countries and Territories, January 2022: A Systematic Analysis for World Health Organization COVID-19 Dashboard. Life Cycle 2022, 2, e10.
Davis, H.E.; McCorkell, L.; Vogel, J.M.; Topol, E.J. Long COVID: Major Findings, Mechanisms and Recommendations. Nat. Rev. Microbiol. 2023, 21, 133–146, Correction in Nat. Rev. Microbiol. 2023, 21, 408. https://doi.org/10.1038/s41579-023-00896-0.
Alrajhi, N.N. Post-COVID-19 Pulmonary Fibrosis: An Ongoing Concern. Ann. Thorac. Med. 2023, 18, 173–181.
Tsampasian, V.; Bäck, M.; Bernardi, M.; Cavarretta, E.; Dȩbski, M.; Gati, S.; Hansen, D.; Kränkel, N.; Koskinas, K.C.; Niebauer, J.; et al. Cardiovascular Disease as Part of Long COVID: A Systematic Review. Eur. J. Prev. Cardiol. 2025, 32, 485–498.
Knight, R.; Walker, V.; Ip, S.; Cooper, J.A.; Bolton, T.; Keene, S.; Denholm, R.; Akbari, A.; Abbasizanjani, H.; Torabi, F.; et al. Association of COVID-19 With Major Arterial and Venous Thrombotic Diseases: A Population-Wide Cohort Study of 48 Million Adults in England and Wales. Circulation 2022, 146, 892–906.
Xie, Y.; Xu, E.; Bowe, B.; Al-Aly, Z. Long-Term Cardiovascular Outcomes of COVID-19. Nat. Med. 2022, 28, 583–590.
Nappi, F.; Nappi, P.; Gambardella, I.; Avtaar Singh, S.S. Thromboembolic Disease and Cardiac Thrombotic Complication in COVID-19: A Systematic Review. Metabolites 2022, 12, 889.
Giacca, M.; Shah, A.M. The Pathological Maelstrom of COVID-19 and Cardiovascular Disease. Nat. Cardiovasc. Res. 2022, 1, 200–210.
Hilser, J.R.; Spencer, N.J.; Afshari, K.; Gilliland, F.D.; Hu, H.; Deb, A.; Lusis, A.J.; Wilson Tang, W.H.; Hartiala, J.A.; Hazen, S.L.; et al. COVID-19 Is a Coronary Artery Disease Risk Equivalent and Exhibits a Genetic Interaction with ABO Blood Type. Arterioscler. Thromb. Vasc. Biol. 2024, 44, 2321–2333.
Guan, W.; Ni, Z.; Hu, Y.; Liang, W.; Ou, C.; He, J.; Liu, L.; Shan, H.; Lei, C.; Hui, D.S.C.; et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. 2020, 382, 1708–1720.
Huang, S.W.; Wang, S.F. Sars-Cov-2 Entry Related Viral and Host Genetic Variations: Implications on Covid-19 Severity, Immune Escape, and Infectivity. Int. J. Mol. Sci. 2021, 22, 3060.
Xu, S.W.; Ilyas, I.; Weng, J.P. Endothelial Dysfunction in COVID-19: An Overview of Evidence, Biomarkers, Mechanisms and Potential Therapies. Acta Pharmacol. Sin. 2023, 44, 695–709.
Wu, X.; Xiang, M.; Jing, H.; Wang, C.; Novakovic, V.A.; Shi, J. Damage to Endothelial Barriers and Its Contribution to Long COVID. Angiogenesis 2024, 27, 5–22.
Hasanvand, A. COVID-19 and the Role of Cytokines in This Disease. Inflammopharmacology 2022, 30, 789–798.
Zanza, C.; Romenskaya, T.; Manetti, A.C.; Franceschi, F.; La Russa, R.; Bertozzi, G.; Maiese, A.; Savioli, G.; Volonnino, G.; Longhitano, Y. Cytokine Storm in COVID-19: Immunopathogenesis and Therapy. Medicina 2022, 58, 144.
Ghaffarpour, S.; Ghazanfari, T.; Ardestani, S.K.; Naghizadeh, M.M.; Vaez Mahdavi, M.R.; Salehi, M.; Majd, A.M.M.; Rashidi, A.; Chenary, M.R.; Mostafazadeh, A.; et al. Cytokine Profiles Dynamics in COVID-19 Patients: A Longitudinal Analysis of Disease Severity and Outcomes. Sci. Rep. 2025, 15, 14209.
Hottz, E.D.; Bozza, P.T. Platelet-Leukocyte Interactions in COVID-19: Contributions to Hypercoagulability, Inflammation, and Disease Severity. Res. Pract. Thromb. Haemost. 2022, 6, e12709.
Sciaudone, A.; Corkrey, H.; Humphries, F.; Koupenova, M. Platelets and SARS-CoV-2 during COVID-19: Immunity, Thrombosis, and Beyond. Circ. Res. 2023, 132, 1272–1289.
Ghasemzadeh, M.; Ahmadi, J.; Hosseini, E. Platelet-Leukocyte Crosstalk in COVID-19: How Might the Reciprocal Links between Thrombotic Events and Inflammatory State Affect Treatment Strategies and Disease Prognosis? Thromb. Res. 2022, 213, 179–194.
Durrington, P. Blood Lipids after COVID-19 Infection. Lancet Diabetes Endocrinol. 2023, 11, 68–69.
Ochoa-Ramírez, L.A.; De la Herrán Arita, A.K.; Sanchez-Zazueta, J.G.; Ríos-Burgueño, E.; Murillo-Llanes, J.; De Jesús-González, L.A.; Farfan-Morales, C.N.; Cordero-Rivera, C.D.; del Ángel, R.M.; Romero-Utrilla, A.; et al. Association between Lipid Profile and Clinical Outcomes in COVID-19 Patients. Sci. Rep. 2024, 14, 12139.
Bisher, M.; Thamer, A.; Shahata, S.; Atia, A. Elevated Blood Glucose in COVID-19 Patients: An Explorative Study. medRxiv 2024.
Goel, V.; Raizada, A.; Aggarwal, A.; Madhu, S.V.; Kar, R.; Agrawal, A.; Mahla, V.; Goel, A. Long-Term Persistence of COVID-Induced Hyperglycemia: A Cohort Study. Am. J. Trop. Med. Hyg. 2024, 110, 512–517.
Wieczfinska, J.; Kleniewska, P.; Pawliczak, R. Oxidative Stress-Related Mechanisms in SARS-CoV-2 Infections. Oxid. Med. Cell. Longev. 2022, 2022, 5589089.
Pathak, G.A.; Karjalainen, J.; Stevens, C.; Neale, B.M.; Daly, M.; Ganna, A.; Andrews, S.J.; Kanai, M.; Cordioli, M.; Polimanti, R.; et al. A First Update on Mapping the Human Genetic Architecture of COVID-19. Nature 2022, 608, E1–E10.
Kanai, M.; Andrews, S.J.; Cordioli, M.; Stevens, C.; Neale, B.M.; Daly, M.; Ganna, A.; Pathak, G.A.; Iwasaki, A.; Karjalainen, J.; et al. A Second Update on Mapping the Human Genetic Architecture of COVID-19. Nature 2023, 621, E7–E26.
Hartiala, J.A.; Han, Y.; Jia, Q.; Hilser, J.R.; Huang, P.; Gukasyan, J.; Schwartzman, W.S.; Cai, Z.; Biswas, S.; Trégouët, D.A.; et al. Genome-Wide Analysis Identifies Novel Susceptibility Loci for Myocardial Infarction. Eur. Heart J. 2021, 42, 919–933.
Reilly, M.P.; Li, M.; He, J.; Ferguson, J.F.; Stylianou, I.M.; Mehta, N.N.; Burnett, M.S.; Devaney, J.M.; Knouff, C.W.; Thompson, J.R.; et al. Identification of ADAMTS7 as a Novel Locus for Coronary Atherosclerosis and Association of ABO with Myocardial Infarction in the Presence of Coronary Atherosclerosis: Two Genome-Wide Association Studies. Lancet 2011, 377, 383–392.
Wang, S.; Peng, H.; Chen, F.; Liu, C.; Zheng, Q.; Wang, M.; Wang, J.; Yu, H.; Xue, E.; Chen, X.; et al. Identification of Genetic Loci Jointly Influencing COVID-19 and Coronary Heart Diseases. Hum. Genom. 2023, 17, 101.
Wen, Y.P.; Yu, Z.G. Identifying Shared Genetic Loci and Common Risk Genes of Rheumatoid Arthritis Associated with Three Autoimmune Diseases Based on Large-Scale Cross-Trait Genome-Wide Association Studies. Front. Immunol. 2023, 14, 1160397.
Wallace, C. Eliciting Priors and Relaxing the Single Causal Variant Assumption in Colocalisation Analyses. PLoS Genet. 2020, 16, e1008720.
Donovan, A.; Lima, C.A.; Pinkus, J.L.; Pinkus, G.S.; Zon, L.I.; Robine, S.; Andrews, N.C. The Iron Exporter Ferroportin/Slc40a1 Is Essential for Iron Homeostasis. Cell Metab. 2005, 1, 191–200.
Boyd, H.A.; Junker, T.G.; Biering-Sørensen, T.; Jan Wohlfahrt, J.; Hviid, A. SARS-CoV-2 Infection and Long-Term Risk of Cardiovascular and Renal Morbidity. medRxiv 2025.
López-Hernández, Y.; Monárrez-Espino, J.; López, D.A.G.; Zheng, J.; Borrego, J.C.; Torres-Calzada, C.; Elizalde-Díaz, J.P.; Mandal, R.; Berjanskii, M.; Martínez-Martínez, E.; et al. The Plasma Metabolome of Long COVID Patients Two Years after Infection. Sci. Rep. 2023, 13, 12420.
WHO. COVID-19 Cases WHO COVID-19 Dashboard; WHO: Geneva, Switzerland, 2024.
Guo, H.; Li, T.; Wen, H. Identifying Shared Genetic Loci between Coronavirus Disease 2019 and Cardiovascular Diseases Based on Cross-Trait Meta-Analysis. Front. Microbiol. 2022, 13, 993933.
Lee, C.H.; Shi, H.; Pasaniuc, B.; Eskin, E.; Han, B. PLEIO: A Method to Map and Interpret Pleiotropic Loci with GWAS Summary Statistics. Am. J. Hum. Genet. 2021, 108, 36–48.
Koskeridis, F.; Fancy, N.; Tan, P.F.; Meena, D.; Evangelou, E.; Elliott, P.; Wang, D.; Matthews, P.M.; Dehghan, A.; Tzoulaki, I. Multi-Trait Association Analysis Reveals Shared Genetic Loci between Alzheimer’s Disease and Cardiovascular Traits. Nat. Commun. 2024, 15, 9827.
Pietzner, M.; Wheeler, E.; Carrasco-Zanini, J.; Raffler, J.; Kerrison, N.D.; Oerton, E.; Auyeung, V.P.W.; Luan, J.; Finan, C.; Casas, J.P.; et al. Genetic Architecture of Host Proteins Involved in SARS-CoV-2 Infection. Nat. Commun. 2020, 11, 6397.
Israeli, M.; Finkel, Y.; Yahalom-Ronen, Y.; Paran, N.; Chitlaru, T.; Israeli, O.; Cohen-Gihon, I.; Aftalion, M.; Falach, R.; Rotem, S.; et al. Genome-Wide CRISPR Screens Identify GATA6 as a Proviral Host Factor for SARS-CoV-2 via Modulation of ACE2. Nat. Commun. 2022, 13, 2237.
Sharma, A.; Wasson, L.K.; Willcox, J.A.L.; Morton, S.U.; Gorham, J.M.; Delaughter, D.M.; Neyazi, M.; Schmid, M.; Agarwal, R.; Jang, M.Y.; et al. GATA6 Mutations in HiPSCs Inform Mechanisms for Maldevelopment of the Heart, Pancreas, and Diaphragm. Elife 2020, 9, e53278.
Claussnitzer, M.; Dankel, S.N.; Kim, K.-H.; Quon, G.; Meuleman, W.; Haugen, C.; Glunk, V.; Sousa, I.S.; Beaudry, J.L.; Puviindran, V.; et al. FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N. Engl. J. Med. 2015, 373, 895–907.
Mifsud, B.; Tavares-Cadete, F.; Young, A.N.; Sugar, R.; Schoenfelder, S.; Ferreira, L.; Wingett, S.W.; Andrews, S.; Grey, W.; Ewels, P.A.; et al. Mapping Long-Range Promoter Contacts in Human Cells with High-Resolution Capture Hi-C. Nat. Genet. 2015, 47, 598–606.
Hanson, A.L.; Mulè, M.P.; Ruffieux, H.; Mescia, F.; Bergamaschi, L.; Pelly, V.S.; Turner, L.; Kotagiri, P.; Göttgens, B.; Hess, C.; et al. Iron Dysregulation and Inflammatory Stress Erythropoiesis Associates with Long-Term Outcome of COVID-19. Nat. Immunol. 2024, 25, 471–482.
Yan, F.; Li, K.; Xing, W.; Dong, M.; Yi, M.; Zhang, H. Role of Iron-Related Oxidative Stress and Mitochondrial Dysfunction in Cardiovascular Diseases. Oxid. Med. Cell. Longev. 2022, 2022, 5124553.
Wang, H.; Huang, Z.; Du, C.; Dong, M. Iron Dysregulation in Cardiovascular Diseases. Rev. Cardiovasc. Med. 2024, 25, 16.
Kobayashi, M.; Suhara, T.; Baba, Y.; Kawasaki, N.K.; Higa, J.K.; Matsui, T. Pathological Roles of Iron in Cardiovascular Disease. Curr. Drug Targets 2018, 19, 1068–1076.
Guo, S.; Mao, X.; Li, X.; Ouyang, H. Association between Iron Status and Incident Coronary Artery Disease: A Population Based-Cohort Study. Sci. Rep. 2022, 12, 17490.
Thompson, E.A.; Cascino, K.; Ordonez, A.A.; Zhou, W.; Vaghasia, A.; Hamacher-Brady, A.; Brady, N.R.; Sun, I.H.; Wang, R.; Rosenberg, A.Z.; et al. Metabolic Programs Define Dysfunctional Immune Responses in Severe COVID-19 Patients. Cell Rep. 2021, 34, 108863.
Virga, F.; Taverna, D.; Ferrero, G.; Leclercq, M.; El Hachem, N.; Godoy-Tena, G.; Jacobs, C.; Tarallo, S.; Pardini, B.; Naccarati, A.; et al. Transcriptome Changes in Circulating Immune Cells of Critical COVID-19 Patients Predict a Specific Metabolic and Epigenetic Imprint. J. Transl. Med. 2026, 24, 247.
Abdelmoaty, M.M.; Yeapuri, P.; Machhi, J.; Olson, K.E.; Shahjin, F.; Kumar, V.; Zhou, Y.; Liang, J.; Pandey, K.; Acharya, A.; et al. Defining the Innate Immune Responses for SARS-CoV-2-Human Macrophage Interactions. Front. Immunol. 2021, 12, 741502.
Gupta, S.; Hemeg, H.A.; Afrin, F. Immuno-Epigenetic Paradigms in Coronavirus Infection. Front. Immunol. 2025, 16, 1596135.
Suhm, T.; Kaimal, J.M.; Dawitz, H.; Peselj, C.; Masser, A.E.; Hanzén, S.; Ambrožič, M.; Smialowska, A.; Björck, M.L.; Brzezinski, P.; et al. Mitochondrial Translation Efficiency Controls Cytoplasmic Protein Homeostasis. Cell Metab. 2018, 27, 1309–1322.e6.
Topf, U.; Suppanz, I.; Samluk, L.; Wrobel, L.; Böser, A.; Sakowska, P.; Knapp, B.; Pietrzyk, M.K.; Chacinska, A.; Warscheid, B. Quantitative Proteomics Identifies Redox Switches for Global Translation Modulation by Mitochondrially Produced Reactive Oxygen Species. Nat. Commun. 2018, 9, 324. [Google Scholar] [CrossRef]
Hofmann, C.; Serafin, A.; Schwerdt, O.M.; Fischer, J.; Sicklinger, F.; Younesi, F.S.; Byrne, N.J.; Meyer, I.S.; Malovrh, E.; Sandmann, C.; et al. Transient Inhibition of Translation Improves Cardiac Function after Ischemia/Reperfusion by Attenuating the Inflammatory Response. Circulation 2024, 150, 1248–1267. [Google Scholar] [CrossRef]
Li, H.; Wen, J.; Zhang, X.; Dai, Z.; Liu, M.; Zhang, H.; Zhang, N.; Lei, R.; Luo, P.; Zhang, J. Large-Scale Genetic Correlation Studies Explore the Causal Relationship and Potential Mechanism between Gut Microbiota and COVID-19-Associated Risks. BMC Microbiol. 2024, 24, 292. [Google Scholar] [CrossRef] [PubMed]
Huang, X.; Yao, M.; Tian, P.; Wong, J.Y.Y.; Liu, Z.; Zhao, J.V. Shared Genetic Etiology and Causality between COVID-19 and Venous Thromboembolism: Evidence from Genome-Wide Cross Trait Analysis and Bi-Directional Mendelian Randomization Study. medRxiv 2022. [Google Scholar] [CrossRef]
Butler-Laporte, G.; Povysil, G.; Kosmicki, J.A.; Cirulli, E.T.; Drivas, T.; Furini, S.; Saad, C.; Schmidt, A.; Olszewski, P.; Korotko, U.; et al. Exome-Wide Association Study to Identify Rare Variants Influencing COVID-19 Outcomes: Results from the Host Genetics Initiative. PLoS Genet. 2022, 18, e1010367. [Google Scholar] [CrossRef] [PubMed]
D’Antonio, M.; Nguyen, J.P.; Arthur, T.D.; Matsui, H.; D’Antonio-Chronowska, A.; Frazer, K.A.; Neale, B.M.; Daly, M.; Ganna, A.; Stevens, C.; et al. SARS-CoV-2 Susceptibility and COVID-19 Disease Severity Are Associated with Genetic Variants Affecting Gene Expression in a Variety of Tissues. Cell Rep. 2021, 37, 110020. [Google Scholar] [CrossRef]
Aragam, K.G.; Jiang, T.; Goel, A.; Kanoni, S.; Wolford, B.N.; Atri, D.S.; Weeks, E.M.; Wang, M.; Hindy, G.; Zhou, W.; et al. Discovery and Systematic Characterization of Risk Variants and Genes for Coronary Artery Disease in over a Million Participants. Nat. Genet. 2022, 54, 1803–1815. [Google Scholar] [CrossRef]
Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M.; et al. NCBI GEO: Archive for Functional Genomics Data Sets—Update. Nucleic Acids Res. 2013, 41, D991–D995. [Google Scholar] [CrossRef] [PubMed]
Auton, A.; Abecasis, G.R.; Altshuler, D.M.; Durbin, R.M.; Bentley, D.R.; Chakravarti, A.; Clark, A.G.; Donnelly, P.; Eichler, E.E.; Flicek, P.; et al. A Global Reference for Human Genetic Variation. Nature 2015, 526, 68–74. [Google Scholar] [CrossRef] [PubMed]
Byrska-Bishop, M.; Evani, U.S.; Zhao, X.; Basile, A.O.; Abel, H.J.; Regier, A.A.; Corvelo, A.; Clarke, W.E.; Musunuri, R.; Nagulapalli, K.; et al. High-Coverage Whole-Genome Sequencing of the Expanded 1000 Genomes Project Cohort Including 602 Trios. Cell 2022, 185, 3426–3440.e19. [Google Scholar] [CrossRef] [PubMed]
Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef]
Bulik-Sullivan, B.; Finucane, H.K.; Anttila, V.; Gusev, A.; Day, F.R.; Loh, P.R.; Duncan, L.; Perry, J.R.B.; Patterson, N.; Robinson, E.B.; et al. An Atlas of Genetic Correlations across Human Diseases and Traits. Nat. Genet. 2015, 47, 1236–1241. [Google Scholar] [CrossRef]
Gazal, S.; Finucane, H.K.; Furlotte, N.A.; Loh, P.R.; Palamara, P.F.; Liu, X.; Schoech, A.; Bulik-Sullivan, B.; Neale, B.M.; Gusev, A.; et al. Linkage Disequilibrium-Dependent Architecture of Human Complex Traits Shows Action of Negative Selection. Nat. Genet. 2017, 49, 1421–1427; Correction in Nat. Genet. 2019, 51, 1295. https://doi.org/10.1038/s41588-019-0468-x. [Google Scholar] [CrossRef]
Chang, C.C.; Chow, C.C.; Tellier, L.C.A.M.; Vattikuti, S.; Purcell, S.M.; Lee, J.J. Second-Generation PLINK: Rising to the Challenge of Larger and Richer Datasets. Gigascience 2015, 4, s13742-015-0047-8. [Google Scholar] [CrossRef]
Fadista, J.; Manning, A.K.; Florez, J.C.; Groop, L. The (in)Famous GWAS P-Value Threshold Revisited and Updated for Low-Frequency Variants. Eur. J. Hum. Genet. 2016, 24, 1202–1205. [Google Scholar] [CrossRef] [PubMed]
Ward, L.D.; Kellis, M. HaploReg v4: Systematic Mining of Putative Causal Variants, Cell Types, Regulators and Target Genes for Human Complex Traits and Disease. Nucleic Acids Res. 2016, 44, D877–D881. [Google Scholar] [CrossRef]
Zhbannikov, I.Y.; Arbeev, K.; Ukraintseva, S.; Yashin, A.I. HaploR: An R Package for Querying Web-Based Annotation Tools. F1000Research 2017, 6, 97. [Google Scholar] [CrossRef]
Foley, C.N.; Staley, J.R.; Breen, P.G.; Sun, B.B.; Kirk, P.D.W.; Burgess, S.; Howson, J.M.M. A Fast and Efficient Colocalization Algorithm for Identifying Shared Genetic Risk Factors across Multiple Traits. Nat. Commun. 2021, 12, 764. [Google Scholar] [CrossRef]
Liu, B.; Gloudemans, M.J.; Rao, A.S.; Ingelsson, E.; Montgomery, S.B. Abundant Associations with Gene Expression Complicate GWAS Follow-Up. Nat. Genet. 2019, 51, 768–769. [Google Scholar] [CrossRef]
Giambartolomei, C.; Vukcevic, D.; Schadt, E.E.; Franke, L.; Hingorani, A.D.; Wallace, C.; Plagnol, V. Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. PLoS Genet. 2014, 10, e1004383. [Google Scholar] [CrossRef] [PubMed]
Wang, G.; Sarkar, A.; Carbonetto, P.; Stephens, M. A Simple New Approach to Variable Selection in Regression, with Application to Genetic Fine Mapping. J. R. Stat. Soc. Ser. B Stat. Methodol. 2020, 82, 1273–1300. [Google Scholar] [CrossRef]
Wallace, C. A More Accurate Method for Colocalisation Analysis Allowing for Multiple Causal Variants. PLoS Genet. 2021, 17, e1009440. [Google Scholar] [CrossRef]
Chen, W.; Larrabee, B.R.; Ovsyannikova, I.G.; Kennedy, R.B.; Haralambieva, I.H.; Poland, G.A.; Schaid, D.J. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics. Genetics 2015, 200, 719–736. [Google Scholar] [CrossRef]
Wakefield, J. Bayes Factors for Genome-Wide Association Studies: Comparison with P-Values. Genet. Epidemiol. 2009, 33, 79–86. [Google Scholar] [CrossRef] [PubMed]
Xue, A.; Zhu, Z.; Wang, H.; Jiang, L.; Visscher, P.M.; Zeng, J.; Yang, J. Unravelling the Complex Causal Effects of Substance Use Behaviours on Common Diseases. Commun. Med. 2024, 4, 43. [Google Scholar] [CrossRef]
Zhu, Z.; Zheng, Z.; Zhang, F.; Wu, Y.; Trzaskowski, M.; Maier, R.; Robinson, M.R.; McGrath, J.J.; Visscher, P.M.; Wray, N.R.; et al. Causal Associations between Risk Factors and Common Diseases Inferred from GWAS Summary Data. Nat. Commun. 2018, 9, 224. [Google Scholar] [CrossRef]
Medway, C.; Shi, H.; Bullock, J.; Black, H.; Brown, K.; Vafadar-Isfahani, B.; Matharoo-Ball, B.; Ball, G.; Rees, R.; Kalsheker, N.; et al. Using in Silico LD Clumping and Meta-Analysis of Genomewide Datasets as a Complementary Tool to Investigate and Validate New Candidate Biomarkers in Alzheimer’s Disease. Int. J. Mol. Epidemiol. Genet. 2010, 1, 134–144. [Google Scholar]
Li, A.; Liu, S.; Bakshi, A.; Jiang, L.; Chen, W.; Zheng, Z.; Sullivan, P.F.; Visscher, P.M.; Wray, N.R.; Yang, J.; et al. MBAT-Combo: A More Powerful Test to Detect Gene-Trait Associations from GWAS Data. Am. J. Hum. Genet. 2023, 110, 30–43. [Google Scholar] [CrossRef] [PubMed]
Pruim, R.J.; Welch, R.P.; Sanna, S.; Teslovich, T.M.; Chines, P.S.; Gliedt, T.P.; Boehnke, M.; Abecasis, G.R.; Willer, C.J.; Frishman, D. LocusZoom: Regional Visualization of Genome-Wide Association Scan Results. Bioinformatics 2011, 27, 2336–2337. [Google Scholar] [CrossRef] [PubMed]
Myers, T.A.; Chanock, S.J.; Machiela, M.J. LDlinkR: An R Package for Rapidly Calculating Linkage Disequilibrium Statistics in Diverse Populations. Front. Genet. 2020, 11, 157. [Google Scholar] [CrossRef]
Love, M.I.; Huber, W.; Anders, S. Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2. Genome Biol. 2014, 15, 550. [Google Scholar] [CrossRef] [PubMed]
Anders, S.; Huber, W. Differential Expression Analysis for Sequence Count Data. Genome Biol. 2010, 11, R106. [Google Scholar] [CrossRef]
Huber, W.; von Heydebreck, A.; Sueltmann, H.; Poustka, A.; Vingron, M. Parameter Estimation for the Calibration and Variance Stabilization of Microarray Data. Stat. Appl. Genet. Mol. Biol. 2005, 2, 1008. [Google Scholar] [CrossRef]
Tibshirani, R. Estimating Transformations for Regression via Additivity and Variance Stabilization. J. Am. Stat. Assoc. 1988, 83, 394–405. [Google Scholar] [CrossRef]
Zhu, A.; Ibrahim, J.G.; Love, M.I. Heavy-Tailed Prior Distributions for Sequence Count Data: Removing the Noise and Preserving Large Differences. Bioinformatics 2019, 35, 2084–2092. [Google Scholar] [CrossRef]
Wickham, H. Ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2016; Volume 35. [Google Scholar]
Kolde, R.; Maintainer, R.K. Pheatmap: Pretty Heatmaps; R. Package: New York, NY, USA, 2015. [Google Scholar]
Xu, S.; Hu, E.; Cai, Y.; Xie, Z.; Luo, X.; Zhan, L.; Tang, W.; Wang, Q.; Liu, B.; Wang, R.; et al. Using ClusterProfiler to Characterize Multiomics Data. Nat. Protoc. 2024, 19, 3292–3320. [Google Scholar] [CrossRef] [PubMed]
Yu, G. ClusterProfiler: An Universal Enrichment Tool for Functional and Comparative Study. BioRxiv 2018. [Google Scholar] [CrossRef]
Yu, G.; Wang, L.G.; Han, Y.; He, Q.Y. ClusterProfiler: An R Package for Comparing Biological Themes among Gene Clusters. Omics J. Integr. Biol. 2012, 16, 284–287. [Google Scholar] [CrossRef]
Wu, T.; Hu, E.; Xu, S.; Chen, M.; Guo, P.; Dai, Z.; Feng, T.; Zhou, L.; Tang, W.; Zhan, L.; et al. ClusterProfiler 4.0: A Universal Enrichment Tool for Interpreting Omics Data. Innovation 2021, 2, 100141. [Google Scholar] [CrossRef]
Yu, G. Thirteen Years of ClusterProfiler. Innovation 2024, 5, 100722. [Google Scholar] [CrossRef] [PubMed]
Liberzon, A.; Subramanian, A.; Pinchback, R.; Thorvaldsdóttir, H.; Tamayo, P.; Mesirov, J.P. Molecular Signatures Database (MSigDB) 3.0. Bioinformatics 2011, 27, 1739–1740. [Google Scholar] [CrossRef] [PubMed]
Liberzon, A.; Birger, C.; Thorvaldsdóttir, H.; Ghandi, M.; Mesirov, J.P.; Tamayo, P. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015, 1, 417–425. [Google Scholar] [CrossRef] [PubMed]
Subramanian, A.; Tamayo, P.; Mootha, V.K.; Mukherjee, S.; Ebert, B.L.; Gillette, M.A.; Paulovich, A.; Pomeroy, S.L.; Golub, T.R.; Lander, E.S.; et al. Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles. Proc. Natl. Acad. Sci. USA 2005, 102, 15545–15550. [Google Scholar] [CrossRef]
Benjamini, Y.; Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 1995, 57, 289–300. [Google Scholar] [CrossRef]
Aleksander, S.A.; Balhoff, J.; Carbon, S.; Cherry, J.M.; Drabkin, H.J.; Ebert, D.; Feuermann, M.; Gaudet, P.; Harris, N.L.; Hill, D.P.; et al. The Gene Ontology Knowledgebase in 2023. Genetics 2023, 224, iyad031. [Google Scholar] [CrossRef]
Köhler, S.; Gargano, M.; Matentzoglu, N.; Carmody, L.C.; Lewis-Smith, D.; Vasilevsky, N.A.; Danis, D.; Balagura, G.; Baynam, G.; Brower, A.M.; et al. The Human Phenotype Ontology in 2021. Nucleic Acids Res. 2021, 49, D1207–D1217. [Google Scholar] [CrossRef]

Figure 1. Visualization of GWAS-GWAS colocalization events and Venn diagram. (A–C) GWAS-GWAS LocusCompareR plots in which each dot represents a SNP in the region of interest: chr1:77,695,983–78,192,445 (1p31.1), chr8:21,773,384–22,270,797 (8p21.3), and chr18:19,748,905–20,248,798 (18q11.2), respectively. The labeled SNPs highlighted with purple squares are the lead SNPs (for both traits), and the other SNPs are colored according to their LD (r²) with the lead SNP at their respective locus. The X and Y axes represent −log10 p-values from the CAD and COVID-19 genome-wide association studies, respectively. (D) Venn diagram comparing the Bayesian, HyPrColoc, and LocusCompareR methods used for defining colocalization.

Figure 2. Sensitivity analysis and robustness of shared causal signals for both COVID-19 and CAD under the assumption that H₃ and H₄ are approximately equal. The left panels display local Manhattan plots of the GWAS summary statistics, showing −log10 p-values for the 1p31.1, 8p21.3, and 18q11.2 loci, with fine-mapped signals shown as purple dots. The right panels show the prior and posterior probabilities for hypotheses H₀ through H₄ as a function of the prior probability of colocalization (P₁₂ < 5 × 10⁻⁶). A vertical line indicates the threshold used for sensitivity checks. Blue and green points highlight SNPs in strong linkage disequilibrium (r² > 0.8) with the prioritized causal variants. Panels (A–G) show the allelic effects of the lead SNPs on COVID-19 and CAD using the EUR data, while panels (H–K) provide the corresponding effects using SAS data from the 1000 Genomes Project.

Figure 3. Scatter plots showing the genetic effect of a lead SNP associated with CAD on the exposure to COVID-19, and vice versa, determined through bidirectional GSMR at the three pleiotropic loci (A–C). The X-axis represents the effect size from CAD. The Y-axis represents the effect size from COVID-19. The dashed line denotes the regression slope, while the horizontal and vertical lines represent the effect size of each variant.

Figure 4. LocusZoomR regional plots (A–C) showing the spatial association of one lead SNP for COVID-19 and one for CAD with likely candidate genes at the three pleiotropic loci 1p31.1, 8p21.3, and 18q11.2, respectively. Lead SNPs are labeled in the figures. The red lines denote prioritized genes identified in the gene-level analysis, with the length of each line corresponding to the physical size of the respective gene.

Figure 5. Altered gene expression in peripheral blood mononuclear cells (PBMCs) of patients with COVID-19. (A) Principal component analysis (PCA) cluster plot of the GSE202805 dataset, illustrating transcriptomic profiles from PBMCs of individuals with COVID-19 (red) and healthy subjects (blue). PC1 and PC2 are the components that capture the largest share of variance in gene expression. (B) Volcano plot showing differential gene expression in PBMCs from COVID-19 patients versus healthy subjects. The X-axis denotes the log2 fold change, and the Y-axis denotes the −log10 p-value. The blue dots denote significantly downregulated genes, and the red dots denote significantly overexpressed genes (p-value < 0.05). DMTN and PIWIL2 are labeled with light green dots. (C) Heatmap visualizing gene expression patterns across subjects with COVID-19 and healthy subjects. The dendrograms show the clustering of differentially expressed genes (rows). In the heatmap, each gene is mapped to a color gradient representing its normalized expression level across the samples (from bright blue = low level of expression to dark red = high level of expression).

Figure 6. Gene set enrichment analysis of differentially expressed genes in peripheral blood mononuclear cells of patients with COVID-19 versus healthy subjects. (A) Heme metabolism hallmark from the hallmark gene set enrichment analysis (GSEA), with DMTN overexpression. The color band at the bottom represents the up (left) to down (right) trend of gene expression in the hallmark. (B) Enriched pathways from the Ontology gene sets implicated by DMTN overexpression. (C) Enriched pathway from the Ontology gene sets implicated by lower expression of PIWIL2. The color band at the bottom of (B,C) represents the down (left) to up (right) trend of gene expression in the respective pathways.

Table 1. Genetic correlations between COVID-19 phenotypes and CAD in EUR and SAS populations.

Global Genetic Correlation Between COVID-19 and CAD for European Ancestry
Phenotype 1 (COVID-19)	Phenotype 2 (CAD)		R^G	SE^G	COV^G	SE^COV	Z score	P_R^G
Critically ill	Coronary artery disease		1.02 × 10⁻¹	1.98 × 10⁻²	4 × 10⁻⁴	6.98 × 10⁻⁵	5.19	p < 2.03 × 10⁻⁷
Moderate to severe hospitalized			1.51 × 10⁻¹	1.79 × 10⁻²	6 × 10⁻⁴	7.03 × 10⁻⁵	8.45	p < 2.67 × 10⁻¹⁷
SARS-CoV-2 reported cases			1.21 × 10⁻¹	1.21 × 10⁻¹	4 × 10⁻⁴	6.72 × 10⁻⁵	6.60	p < 4.04 × 10⁻¹¹
Global genetic correlation between COVID-19 and CAD for South Asian ancestry including Lahore Punjabi population in Pakistan
Critically ill	Coronary artery disease		9.38 × 10⁻²	2.33 × 10⁻²	4 × 10⁻⁴	1.0 × 10⁻⁴	4.02	p < 5.70 × 10⁻⁵
Moderate to severe hospitalized			1.57 × 10⁻¹	2.32 × 10⁻²	8 × 10⁻⁴	1.0 × 10⁻⁴	6.79	p < 1.07 × 10⁻¹¹
SARS-CoV-2 reported cases			1.34 × 10⁻¹	2.44 × 10⁻²	6 × 10⁻⁴	1.0 × 10⁻⁴	5.50	p < 3.68 × 10⁻⁸
Local genetic correlation between COVID-19 and CAD for European ancestry
Phenotype 1 (COVID-19)	Phenotype 2 (CAD)	Coordinate/Locus	R^G	SE^G	COV^G	SE^COV	Z score	P_R^G
Moderate to severe hospitalized	Coronary artery disease	Chr1:77695983_78192445 1p31.1 (0.49 Mb)	6.58 × 10⁻¹	0.17	1.17 × 10⁻²	5.5 × 10⁻³	3.76	p < 2.0 × 10⁻⁴
		Chr8:21773384_22270797 8p21.3 (0.49 Mb)	−8.11 × 10⁻¹	0.17	−2.8 × 10⁻²	7.4 × 10⁻³	−4.54	p < 5.45 × 10⁻⁶
		Chr18:19748905_20248798 18q11.2 (0.49 Mb)	4.76 × 10⁻¹	−0.12	3.9 × 10⁻²	−1.25 × 10⁻²	3.94	p < 7.87 × 10⁻⁵
Local genetic correlation between COVID-19 and CAD for South Asian ancestry including Lahore Punjabi population in Pakistan
Moderate to severe hospitalized	Coronary artery disease	Chr8:21773384_22270797 8p21.3 (0.49 Mb)	−8.78 × 10⁻¹	0.16	−4.04 × 10⁻²	1.16 × 10⁻²	−5.39	p < 6.92 × 10⁻⁸
Moderate to severe hospitalized	Coronary artery disease	Chr18:19748905_20248798 18q11.2 (0.49 Mb)	9.22 × 10⁻¹	0.20	4.76 × 10⁻²	2.71 × 10⁻²	4.46	p < 7.92 × 10⁻⁶

Global genetic correlation was estimated using genome-wide linkage disequilibrium score regression (LDSC) to quantify the genome-wide shared genetic architecture between the two traits. Conversely, local genetic correlation was assessed for the three identified loci (1p31.1, 8p21.3, and 18q11.2) using an integrated colocalization framework. Both analyses utilized population-specific LD reference panels from the 1000 Genomes Project (1KGP) for European (EUR) and South Asian (SAS) ancestries. R^G = Genetic correlation; SE^G = Standard error of R^G; COV^G = Genetic covariance; SE^COV = Standard error of COV^G; P_R^G = p-value of R^G.
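
As a quick arithmetic check on Table 1, the Z score of a genetic correlation is the estimate divided by its standard error, and a two-sided p-value follows from the standard normal distribution. The sketch below is illustrative only: the function name `rg_z_and_p` is hypothetical, and the normal approximation is an assumption about how the reported p-values relate to the Z scores, not a description of the LDSC software itself.

```python
import math

def rg_z_and_p(rg: float, se: float) -> tuple[float, float]:
    """Return the Z score and two-sided normal p-value for an estimate and its SE."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = rg / se
    # Two-sided tail of a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Rounded values from the first EUR row of Table 1
# (critically ill COVID-19 versus coronary artery disease)
z, p = rg_z_and_p(1.02e-1, 1.98e-2)
print(f"Z = {z:.2f}, p = {p:.2e}")
```

Because the table reports rounded estimates, the recomputed Z score (about 5.15) differs slightly from the tabulated value of 5.19.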

Table 2. Coincident pleiotropic signals between COVID-19 and CAD.

COVID-19 Phenotype	Coordinate/Locus	COVID-19	CAD	PPH_0(abf)	PPH_1(abf)	PPH_2(abf)	PPH_3(abf)	PPH_4(abf)	Ancestry
Critically ill	Chr1:77695983_78192445 1p31.1 (0.49 Mb)	rs7515509	rs2133204	2.20 × 10⁻¹	2.42 × 10⁻²	5.06 × 10⁻²	4.14 × 10⁻³	0.70	EUR
Moderate to severe hospitalized	Chr1:77695983_78192445 1p31.1 (0.49 Mb)	rs7515509	rs2133204	2.43 × 10⁻⁵	3.23 × 10⁻²	1.02 × 10⁻⁵	1.16 × 10⁻²	0.96	EUR
Critically ill	Chr8:21773384_22270797 8p21.3 (0.49 Mb)	rs8192327	rs56390102	7.45 × 10⁻⁵	1.75 × 10⁻⁵	2.38 × 10⁻¹	5.45 × 10⁻²	0.71	EUR
Moderate to severe hospitalized	Chr8:21773384_22270797 8p21.3 (0.49 Mb)	rs8192330	rs56408342	2.49 × 10⁻⁹	3.51 × 10⁻⁸	9.94 × 10⁻³	1.38 × 10⁻¹	0.85	EUR
Critically ill	Chr18:19748905_20248798 18q11.2 (0.49 Mb)	rs4800403	rs16967171	1.60 × 10⁻⁵	7.72 × 10⁻⁵	2.40 × 10⁻³	9.61 × 10⁻³	0.99	EUR
Moderate to severe hospitalized		rs4800403	rs3813126	2.05 × 10⁻⁶	7.75 × 10⁻⁵	1.64 × 10⁻³	6.02 × 10⁻²	0.94	EUR
SARS-CoV-2 reported cases		rs16967171	rs16967171	7.67 × 10⁻⁶	2.96 × 10⁻³	1.56 × 10⁻⁵	4.04 × 10⁻³	0.99	EUR
Critically ill	Chr8:21773384_22270797 8p21.3 (0.49 Mb)	rs56390102	rs56390102	8.60 × 10⁻⁴	1.11 × 10⁻⁵	2.55 × 10⁻¹	1.83 × 10⁻³	0.74	SAS
Critically ill	Chr18:19748905_20248798 18q11.2 (0.49 Mb)	rs4800403	rs16967171	3.29 × 10⁻⁴	1.59 × 10⁻³	3.38 × 10⁻³	1.43 × 10⁻²	0.98	SAS
Moderate to severe hospitalized		rs4800403	rs12958355	4.70 × 10⁻¹¹	9.83 × 10⁻¹⁹	1.66 × 10⁻³	3.27 × 10⁻²	0.97	SAS
SARS-CoV-2 reported cases		rs16967171	rs12958355	5.44 × 10⁻¹²	2.04 × 10⁻¹⁰	2.47 × 10⁻⁴	7.27 × 10⁻³	0.99	SAS
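
The posterior probabilities PPH_0 to PPH_4 in Table 2 come from approximate Bayes factor (ABF) colocalization. A minimal sketch of how per-SNP log ABFs are combined into the five posteriors is given below, following the Wakefield approximation and the Bayesian colocalization test cited in the reference list. The function names, the toy Z scores, the prior standard deviation of 0.2, and the priors p1 = p2 = 10⁻⁴ and p12 = 10⁻⁵ are assumptions chosen for illustration; they are not the settings or data used in this study.

```python
import math

def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_diff(a, b):
    """Stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log(-math.expm1(b - a))

def wakefield_labf(z, se, prior_sd=0.2):
    """Log approximate Bayes factor for one SNP (Wakefield approximation)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4; needs at least two SNPs in the region."""
    s1, s2 = log_sum(labf1), log_sum(labf2)
    s12 = log_sum([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                   # H0: no association
        math.log(p1) + s1,                                     # H1: trait 1 only
        math.log(p2) + s2,                                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                   # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]

# Toy Z scores for five SNPs (made up): both traits peak at the second SNP
z_trait1 = [0.5, 6.0, 0.3, -0.2, 1.0]
z_trait2 = [0.1, 5.5, -0.4, 0.6, 0.2]
pp = coloc_posteriors([wakefield_labf(z, 0.05) for z in z_trait1],
                      [wakefield_labf(z, 0.05) for z in z_trait2])
print([f"PPH_{i} = {v:.3g}" for i, v in enumerate(pp)])
```

When both traits peak at the same SNP, nearly all posterior mass falls on H4; moving the second trait's peak to a different SNP shifts the mass to H3.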

Table 3. Bidirectional generalized summary data-based Mendelian randomization (GSMR) analysis showing the exposure–outcome relationships at the three overlapping genetic loci.

Locus	Coordinates	$\hat{\beta}_{zx(\text{forward-GSMR})}$	$\hat{\beta}_{zy(\text{reverse-GSMR})}$	SE_{zx(forward-GSMR)}	SE_{zy(reverse-GSMR)}	P_{Adj(forward-GSMR)}	P_{Adj(reverse-GSMR)}
1p31.1	1_77695983_78192445	4.92	2.4 × 10⁻³	1.72	1.10 × 10⁻²	4.21 × 10⁻³	0.83 (NS)
8p21.3	8_21773384_22270797	−0.61	0.10	0.28	2.71 × 10⁻²	3.19 × 10⁻²	2.20 × 10⁻⁴
18q11.2	18_19748905_20248798	2.07	0.16	0.44	3.02 × 10⁻²	2.30 × 10⁻⁶	1.72 × 10⁻⁷

Causal effect estimates for bidirectional GSMR, treating CAD and COVID-19 as both exposure and outcome across the three loci, are presented with their respective standard errors and p-values. Coordinates: Genomic location; $\hat{\beta}_{zx(\text{forward-GSMR})}$: Effect estimate for forward-GSMR, where instrumental variables for CAD were taken as the exposure; $\hat{\beta}_{zy(\text{reverse-GSMR})}$: Effect estimate for reverse-GSMR, where instrumental variables for COVID-19 were taken as the exposure; SE_{zx(forward-GSMR)}: Standard error of $\hat{\beta}_{zx(\text{forward-GSMR})}$; SE_{zy(reverse-GSMR)}: Standard error of $\hat{\beta}_{zy(\text{reverse-GSMR})}$; P_{Adj(forward-GSMR)}: Benjamini–Hochberg (FDR) adjusted p-value for forward-GSMR; P_{Adj(reverse-GSMR)}: Benjamini–Hochberg (FDR) adjusted p-value for reverse-GSMR; NS: not significant.
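
The Benjamini–Hochberg adjustment named in the Table 3 footnote can be sketched in a few lines. This is a generic implementation of the step-up procedure, not the code used in the study; the function name `benjamini_hochberg` and the input p-values are hypothetical, and this section does not state which set of tests the adjustment was applied across.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values, not taken from the study
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a bigger one.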

Table 4. Statistical associations of the fine-mapped lead signals (pleiotropic signals) with candidate genes prioritized in the gene-level analyses through mBAT-combo at the three pleiotropic loci.

Gene	Genic Coordinates	Locus	Start SNP	End SNP	TopSNP	P_TopSNP	Eig.	mBAT_Chisq	P_mBAT-Combo	P_mBAT
PIGK	1_77088989_77219430	1p31.1	rs11162292	rs1963170	rs7515509	2.94 × 10⁻¹²	19	52.81	9.08 × 10⁻⁶	4.99 × 10⁻⁵
AK5	1_77282019_77559966	1p31.1	rs11162292	rs1963170	rs7515509	2.94 × 10⁻¹²	19	52.81	9.08 × 10⁻⁶	4.99 × 10⁻⁵
ZZZ3	1_77562416_77683419	1p31.1	rs11162292	rs1963170	rs7515509	2.94 × 10⁻¹²	19	52.81	9.08 × 10⁻⁶	4.99 × 10⁻⁵
USP33	1_77695987_77759852	1p31.1	rs11162292	rs1963170	rs7515509	2.94 × 10⁻¹²	19	52.81	9.08 × 10⁻⁶	4.99 × 10⁻⁵
DOK2	8_21908873_21913690	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
XPO7	8_21919662_22006585	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
NPM2	8_22024125_22036897	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
FGF17	8_22042398_22048809	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
DMTN	8_22048995_22082527	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
FHIP2B	8_22089150_22104911	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
NUDT18	8_22106874_22109419	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
HR	8_22114419_22133384	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
HRURF	8_22130604_22130708	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
REEP4	8_22138020_22141951	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
LGI3	8_22146830_22157084	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
SFTPC	8_22156913_22164479	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
BMP1	8_22165140_22212326	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
PHYHIP	8_22219703_22232101	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
POLR3D	8_22245133_22254601	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
PIWIL2	8_22275316_22357568	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
SLC39A14	8_22367278_22434129	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
PPP3CC	8_22440819_22541142	8p21.3	rs34802507	rs11777848	rs8192330	2.83 × 10⁻⁷	28	54.51	6.46 × 10⁻⁶	1.94 × 10⁻³
GATA6	18_22169589_22202528	18q11.2	rs9949157	rs12955964	rs4800403	1.27 × 10⁻⁷	16	55.79	1.43 × 10⁻⁶	2.63 × 10⁻⁶
CTAGE1	18_22413599_22417915	18q11.2	rs9949157	rs12955964	rs4800403	1.27 × 10⁻⁷	16	55.79	1.43 × 10⁻⁶	2.63 × 10⁻⁶

The results show the genomic range of each gene and locus (GRCh38 build), including the flanking SNPs and the SNPs identified through association analysis. Gene: Likely causal genes; TopSNP: Lead SNP; P_TopSNP: p-value for TopSNP; Eig.: Number of eigenvalues; mBAT_Chisq: Chi-square (χ²) value; P_mBAT-Combo: mBAT-combo p-value; P_mBAT: mBAT p-value.

Table 5. Validation of prioritized candidate genes using bulk RNA-seq expression profiling data.

Gene	ID	baseMean	log2FoldChange	lfcSE	stat	P_DESeq2	P_(Adj)-DESeq2
DMTN	ENSG00000158856	2198.67	2.42	0.22	10.98	4.75 × 10⁻²⁸	1.60 × 10⁻²⁵
PIWIL2	ENSG00000197181	8.71	−2.18	0.89	−2.44	1.45 × 10⁻²	4.21 × 10⁻²

Gene: Candidate gene; ID: Ensembl ID for Gene; baseMean: Mean of normalized counts from all samples for Gene; log2FoldChange: Log2 Fold Change value of Gene; lfcSE: Standard error of Log2 Fold Change; stat: Wald statistic; P_DESeq2: DESeq2 p-value; P_(Adj)-DESeq2: DESeq2 FDR adjusted p-value.
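
To read the log2FoldChange column of Table 5 on a linear scale, the values can be back-transformed as powers of two. The sketch below does this for the two prioritized genes and adds an approximate Wald-type 95% interval from lfcSE. The function name `fold_change_with_ci` is hypothetical, and the interval is a rough normal approximation computed from the rounded table values, not an output of DESeq2.

```python
def fold_change_with_ci(log2_fc, lfc_se, z_crit=1.96):
    """Convert a log2 fold change and its SE to a linear fold change with a Wald-type CI."""
    lower = 2.0 ** (log2_fc - z_crit * lfc_se)
    upper = 2.0 ** (log2_fc + z_crit * lfc_se)
    return 2.0 ** log2_fc, lower, upper

# Rounded values from Table 5
for gene, lfc, se in [("DMTN", 2.42, 0.22), ("PIWIL2", -2.18, 0.89)]:
    fc, lo, hi = fold_change_with_ci(lfc, se)
    print(f"{gene}: fold change {fc:.2f} (approx. 95% CI {lo:.2f} to {hi:.2f})")
```

A log2 fold change of 2.42 corresponds to roughly 5.4-fold higher normalized DMTN expression, while −2.18 corresponds to roughly 0.22-fold expression (about a 4.5-fold reduction) for PIWIL2.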

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
