Article

Identifying a Common Autoimmune Gene Core as a Tool for Verifying Biological Significance and Applicability of Polygenic Risk Scores

by Victoria Sergeevna Shchekina 1,2, Nikita Aleksandrovich Batashkov 1,2, Anna Arkadievna Maznina 1, Julia Aleksandrovna Krupinova 1,2,3, Viktor Pavlovich Bogdanov 1,2, Anna Vasilievna Korobeinikova 4, Dmitry Igorevich Tychinin 4, Olga Valentinovna Glushkova 4, Ekaterina Sergeevna Petriaikina 4, Dmitry Vladimirovich Svetlichnyy 4, Mary Woroncow 2, Vladimir Sergeevich Yudin 4, Anton Arturovich Keskinov 4, Sergey Mikhailovich Yudin 4, Veronika Igorevna Skvortsova 5, Dmitry Vyacheslavovich Tabakov 1,2,*, Andrei Andreevich Deviatkin 4 and Pavel Yu. Volchkov 1,2,3,*
1 Federal State Budgetary Scientific Institution “Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies”, Moscow 125315, Russia
2 Moscow Center for Advanced Studies, Kulakova Street 20, Moscow 123592, Russia
3 Moscow Clinical Scientific Center N.A. A.S. Loginov, Novogireevskaya Street, 1, 1, Moscow 111123, Russia
4 Federal State Budgetary Institution «Centre for Strategic Planning and Management of Biomedical Health Risks» of the Federal Medical and Biological Agency (Centre for Strategic Planning, of the Federal Medical and Biological Agency), Moscow 119121, Russia
5 The Federal Medical Biological Agency (FMBA of Russia), Moscow 123182, Russia
* Authors to whom correspondence should be addressed.
Int. J. Mol. Sci. 2026, 27(1), 543; https://doi.org/10.3390/ijms27010543
Submission received: 28 November 2025 / Revised: 26 December 2025 / Accepted: 3 January 2026 / Published: 5 January 2026
(This article belongs to the Special Issue Genetics and Omics in Autoimmune Diseases)

Abstract

Polygenic autoimmune diseases (ADs) have several common features that are caused by a complex interplay of genetic and environmental factors. Common pathophysiological mechanisms include dysregulation of the immune system, chronic inflammation, and epigenetic changes influenced by external factors. Polygenic risk scores (PRSs), also called polygenic scores (PGSs), are used to predict genetic predisposition to AD manifestation. The use of PRSs faces several challenges, such as applicability to a specific population, performance comparison, and estimation of biological relevance based on SNP number. We compared PRSs with different numbers of SNPs and sought a common genetic core of ADs. Our analysis revealed a list of the most commonly altered genes, which we annotated and interpreted. Clustering of PRSs based on the genes they include showed that clusters of ADs remained consistent across all chosen PRS sizes. We concluded that PRS size does not have an impact on biological relevance.

1. Introduction

Autoimmune diseases (ADs) are conditions resulting from an abnormal response of the adaptive immune system in which it mistakenly targets healthy, functional tissues of the organism as if they were foreign [1]. Over 100 types of autoimmune diseases have been identified [2]. Monogenic autoimmune diseases, such as autoimmune polyendocrine syndrome type 1 and IPEX syndrome, result from pathogenic mutations in single genes (AIRE and FOXP3, respectively) that disrupt the function of the corresponding proteins [3]. Polygenic autoimmune diseases have several common features that are caused by a complex interplay of genetic and environmental factors [4]. These diseases arise from variations in multiple genes, often involving immune-related genes, such as those in the HLA region, and their onset can be influenced by multiple factors, such as sex (onset can be triggered during pregnancy or by an imbalance of sex hormones) [2] and environmental triggers (infections, toxins, stress, and diet). Infections often act as triggers for autoinflammatory processes in genetically susceptible individuals through various mechanisms, such as activation of innate immune responses, molecular mimicry, and T-cell activation [5]. Despite these associations, the exact mechanisms of polygenic AD development are still unclear. Common pathophysiological mechanisms include dysregulation of the immune system, chronic inflammation, and epigenetic changes influenced by external factors. Clinically, these diseases often present with overlapping symptoms, such as fatigue, joint pain, or organ-specific inflammation, and progress in episodes, which typically makes them challenging to diagnose and treat [6].
An overlap of two or more autoimmune diseases commonly manifests in a single patient [7]. Type 1 diabetes (T1D) is commonly associated with autoimmune thyroid diseases (Hashimoto’s thyroiditis and Graves’ disease), as well as with Addison’s disease and celiac disease. Other ADs may also be associated with each other, particularly Sjögren’s syndrome with systemic lupus erythematosus and systemic sclerosis [8]. In patients with T1D, autoimmune thyroid diseases co-occur in 17–30% of cases. Other associated conditions include celiac disease (8%) and autoimmune gastritis (5–10%). However, conditions such as rheumatoid arthritis, Addison’s disease, and systemic lupus erythematosus are rarer, occurring in only 1.2%, 0.2%, and 1.15% of T1D cases, respectively [9]. Gokuladhas S. et al. showed that among 2065 SNPs associated with ADs, 186 were associated with two or more ADs simultaneously [10].
The Genome-Wide Association Study (GWAS) approach is commonly used to understand the genetic landscape that contributes to the development of autoimmune diseases [11]. Among tissue-specific autoimmune diseases, this type of analysis has been performed in type 1 diabetes mellitus (T1DM) and multiple sclerosis [12,13]. Genetic predisposition to autoimmune diseases can significantly increase the likelihood of developing these conditions. Different variants and their combinations can carry varying levels of risk, from low to high. Among the variants, there are predisposing alleles as well as protective alleles. For instance, the relative risk associated with individual genetic variants can be high, increasing the probability of developing diseases such as T1D several-fold [14,15,16]. Due to the frequent co-occurrence of ADs, it is plausible that they share common genomic variant signatures.
To find the biological relevance of GWAS findings, some studies clustered ADs based on GWAS summary statistics. This approach identified pleiotropic gene mutations that are involved in the pathogenesis of multiple diseases. In research by Demela et al. [17], GWAS results for nine autoimmune diseases were summarized and three tissue-specific clusters were identified based on their associated polymorphisms. The first cluster included gastrointestinal diseases (Crohn’s disease, ulcerative colitis, and primary sclerosing cholangitis), the second included systemic lupus erythematosus, T1D, and rheumatoid arthritis, and the third included allergic diseases. The study showed that in these clusters, changes occur at different nodes in the same immune pathways, and different immune cell patterns are involved in their pathogenesis.
To predict genetic predisposition to AD manifestation, polygenic risk scores (PRSs), also known as polygenic scores (PGSs), are used [17,18,19]. PRSs are statistical models that summarize the effects of multiple genetic variants (SNPs) to predict disease risk and are developed based on associations identified in GWASs. However, their interpretation and relevance across different samples remain a matter of debate.
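As a minimal sketch of what "summarizing the effects of multiple variants" can look like, a PRS is commonly computed as a weighted sum of effect-allele dosages, PRS = Σ β_i · g_i. The rsIDs, weights, and genotype below are hypothetical placeholders and are not taken from any PGS Catalog score; skipping missing variants is one simple convention assumed here for illustration.

```python
# Hypothetical effect weights (e.g., log odds ratios from a GWAS); not real data.
weights = {"rs0000001": 0.30, "rs0000002": -0.10, "rs0000003": 0.05}


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant).

    Variants absent from the genotype are skipped, which is an assumption
    of this sketch; real pipelines may impute them instead.
    """
    return sum(beta * dosages[rsid]
               for rsid, beta in weights.items()
               if rsid in dosages)


# Hypothetical individual: two copies of the first allele, one of the second,
# and no genotype available for the third variant.
person = {"rs0000001": 2, "rs0000002": 1}
score = polygenic_score(person, weights)
```

A positive weight raises the score for each copy of the effect allele, while a negative weight (a protective allele) lowers it.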
The quality of the PRS depends on several factors, including:
Sample size and diversity: PRSs developed from large and genetically diverse samples generally have better predictive power.
Population: PRS designed for one population may not work well for other populations due to genetic differences [20].
Additionally, PRSs face several challenges [21,22]:
The limited power of GWAS: The GWASs underlying PRSs often have limited statistical power, especially for rare variants or variants with small effects. As a result, many true associations between genes and the disease go unnoticed, which reduces the accuracy of PRSs.
Pleiotropy: Many genes are involved in several biological processes. Therefore, the same genetic variant can be associated with several diseases, which makes it difficult to interpret the GWAS results and construct accurate PRS.
Relationship without causation: GWAS identifies statistical links between genetic variants and a disease, but does not necessarily establish a causal relationship. The association may be due to a link with the true causal variant located nearby on the chromosome.
In the literature, PRSs for various traits range from several hundred to tens of thousands of SNPs [23]. It may be assumed that a PRS based on a relatively small number of SNPs would include only highly significant variants and therefore exhibit strong predictive performance, whereas a PRS incorporating a very large number of SNPs may be less predictive. However, the number of SNPs included in a PRS is not the most important factor in its predictive quality. The most reliable way to evaluate a PRS is to assess its predictive power on independent samples (validation). Even so, comparing just two PRSs can be challenging, as there is no universally accepted standard for reporting PRS performance. For example, in the PGS Catalog (https://www.pgscatalog.org/), performance estimates can be based on the odds ratio, beta correlation, incremental AUROC, or various other metrics. This lack of standardization complicates the interpretation and transferability of PRSs, particularly in autoimmune diseases.
To address this challenge, we selected the most common diseases for our research: rheumatoid arthritis (RA), psoriasis, Hashimoto’s thyroiditis (HT), colitis, type 1 diabetes mellitus (T1D), asthma, systemic lupus erythematosus (SLE), celiac disease, Crohn’s disease, and multiple sclerosis (Table 1). Our selection process was meticulous, aiming to encompass a range of autoimmune diseases, characterized by a generalized course, and included diseases representative of certain target affected organs or systems of organs. Additionally, we included diseases known to have an association with type 1 diabetes, thereby ensuring a comprehensive representation of autoimmune conditions with potential interconnectedness. This approach allowed us to focus our research efforts on a manageable, yet diverse set of autoimmune diseases, facilitating a more in-depth analysis of their characteristics and potential correlations.
This study aimed to assess whether different ADs can be clustered using PRSs. Consistent clustering according to PRS composition may reflect the biological relevance of risk scores. The identification of a shared “autoimmune core” of polymorphisms could form the basis for general predictive systems for autoimmune disease risk. A binary matrix of PRS-associated genes was constructed to examine clustering patterns across different PRS sizes. In addition, the functions of the most commonly shared genes and their roles in AD pathogenesis were analyzed and discussed.
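The binary matrix mentioned above can be pictured as one row per disease and one column per gene, with 1 marking that the gene appears in at least one PRS for that disease. The following is a minimal sketch under that assumption; the gene symbols occur in this article, but their assignment to diseases here is purely illustrative, and the function name is hypothetical.

```python
# Hypothetical disease -> gene sets. The assignments are illustrative only.
prs_genes = {
    "T1D": {"IL2RA", "SH2B3", "HLA-DQB1"},
    "Celiac disease": {"SH2B3", "HLA-DQB1"},
    "Crohn's disease": {"IL2RA", "ERAP1"},
}


def binary_gene_matrix(prs_genes):
    """Return (sorted gene list, {disease: 0/1 row}) for the given gene sets."""
    genes = sorted(set().union(*prs_genes.values()))
    rows = {disease: [1 if gene in gene_set else 0 for gene in genes]
            for disease, gene_set in prs_genes.items()}
    return genes, rows


genes, matrix = binary_gene_matrix(prs_genes)
```

Each row can then be passed to a distance-based method; rows that share many 1s correspond to diseases whose PRSs draw on similar genes.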

2. Results

2.1. Most Frequently Encountered Genes

The analysis encompassed ten autoimmune diseases (ADs): multiple sclerosis, celiac disease, rheumatoid arthritis, systemic lupus erythematosus, psoriasis, T1D, asthma, colitis, Crohn’s disease, and thyroid disease (Hashimoto’s thyroiditis). Each disease was examined using polygenic scoring models (PRSs) with fewer than 1000 variants extracted from the PGS Catalog database. Several genes were recurrently identified across different diseases. The shared genes with the highest frequency of occurrence were determined, indicating potential key players in the genetic architecture of autoimmune diseases (Table 2).
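One straightforward way to rank shared genes by frequency of occurrence is to count, for each gene, the number of diseases whose PRSs contain it. The sketch below assumes that definition; the disease and gene names are placeholders, and the `min_diseases` threshold is a hypothetical parameter, not one reported in the text.

```python
from collections import Counter

# Placeholder data: disease -> set of genes found in its PRSs.
disease_genes = {
    "Disease A": {"GENE1", "GENE2", "GENE3"},
    "Disease B": {"GENE1", "GENE3"},
    "Disease C": {"GENE1", "GENE4"},
}


def shared_gene_counts(disease_genes, min_diseases=2):
    """Rank genes by the number of diseases they occur in.

    Only genes present in at least `min_diseases` diseases are kept;
    ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(gene
                     for genes in disease_genes.values()
                     for gene in genes)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, n) for gene, n in ranked if n >= min_diseases]


top_shared = shared_gene_counts(disease_genes)
```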

2.2. Review of SNPs in Shared Genes

We then annotated 453 SNPs in the top shared genes to find the alterations most likely to have a significant impact on AD development. As expected, most of the SNPs found were intronic variants (Table S6).
In addition to missense alterations, whose impact on autoimmune disease pathogenesis is strongly supported by the literature and databases, there were a considerable number of other alterations, including variants in introns and exons. However, information about the majority of them is limited: they have not been mentioned in autoimmune disease investigations or annotated in public databases. Despite their use in PRSs, the level of evidence (LOE) in RegulomeDB (https://regulomedb.org/regulome-search/, accessed on 15 July 2025) is lower than 1e for a large number of them.
In Table 3, we list the missense variants we found and their genes. Several of these variants were included in PRSs for only certain autoimmune diseases, although the literature shows a wider range of associations for them. One example is the rs30187 polymorphism of ERAP1: according to the PRSs, this alteration is common in psoriasis, but it is also associated with ankylosing spondylitis [56]. Detecting the association of a single polymorphism with several autoimmune diseases is the primary purpose of the clustering analysis described below.
Analysis of SNPs in the TSBP1 gene revealed that the majority of them are intronic variants. Intronic SNPs typically do not alter the amino acid sequence of the protein, but may affect RNA splicing and gene expression levels, which can influence functionality.
However, among all of the identified SNPs in the TSBP1 gene, only a few have the potential to alter the protein’s structure. Only four SNPs (rs3129941, rs560505, rs1265754, and rs9268384) are missense variants, meaning that they cause a substitution of one amino acid for another in the protein sequence. Such changes may have functional consequences, as they could affect the structure or function of the TSBP1 protein. These SNPs may be associated with changes in phenotype or behavior related to the function of the TSBP1 gene, including its role in regulating immune processes.

2.3. Effect of PRS Size on Biological Impact

Polygenic risk scores (PRSs) are constructed from GWAS data and include variants statistically associated with disease, often with additional filtering, such as linkage considerations. The number of variants in different PRSs for the same disease can vary widely. We hypothesized that PRSs containing fewer SNPs would consist of variants with stronger individual significance. Consequently, if autoimmune diseases share a common genetic core, PRSs with a small number of SNPs should more accurately capture biologically relevant genes strongly associated with immune system function. However, our analyses did not support this prediction.
We analyzed the fraction of common genes as a function of PRS size in a pairwise comparison of each AD pair (Tables S7–S9). Surprisingly, the average fraction of common genes decreased from 4.75% for PRSs with fewer than 1000 variants to 3.36% for PRSs with fewer than 250 variants. We also assumed that the best-performing PRSs contained the most significant general autoimmune and disease-specific genes. To determine the ratio of common genes among the best-performing PRSs, we formed a PRS group based on performance, which included the top five PRSs for each disease according to their AUC score. For this group, the average fraction of common genes was 3.3% (Table S10). To visualize the distribution of ADs based on shared genes, we performed tSNE clustering on the previously described binary matrix for each PRS size.
Initially, we identified three clusters of ADs: T1D clustered with celiac disease (C1); SLE grouped with RA and thyroid disease (C2); and Crohn’s disease and colitis formed the third cluster (C3) (Figure 1A). Despite several changes in the Euclidean distance between some diseases as PRS size decreased, ADs that formed clusters at PRS size 1000 also grouped together at other PRS sizes (Figure 1B,C). T1D at PRS 250 is located closer to RA than to celiac disease, but, in general, the neighborhood remains stable. Interestingly, psoriasis could potentially be grouped with SLE, RA, and T1D; however, for the top AUC score PRS group, the distance between psoriasis and C3 increased dramatically (Figure 1C).
It is important to note that the groups produced by tSNE clustering based on common PRS genes (Figure 1A) are biologically plausible. T1D and celiac disease, which compose C1, have a common genetic mechanism of pathogenesis [74]. Their relationship is also supported by the greater co-occurrence observed in children with a T1D family history (HR: 2.80), HLA-DR3/4 (HR: 1.94), and the single-nucleotide polymorphism rs3184504 at SH2B3 (HR: 1.53).
C2 included RA and SLE, which are known as systemic immune disorders affecting multiple systems and organs. In our study, these ADs were grouped with HT, which differs greatly in its development and in the organs affected. However, some studies have found that RA and HT share common features, such as alterations in CTLA-4, PTPN22, HLA-DRB, CD40, and FOXP3 [75].
Crohn’s disease and ulcerative colitis (C3) are characterized by the enhanced recruitment and retention of effector macrophages, neutrophils, and T cells into the inflamed intestine, where they are activated and release proinflammatory cytokines [76]. The accumulation of effector cells in the inflamed intestine is a result of enhanced recruitment, as well as prolonged survival caused by decreased cellular apoptosis. Crohn’s disease is a predominantly T helper 1- and T helper 17-mediated process, while ulcerative colitis seems to be an atypical T helper 2 disorder. Despite this, Crohn’s disease and colitis clustered together, regardless of the number of genes in the PRSs. Both are inflammatory diseases of the bowel and may have common mechanisms of pathology. This similarity can be explained by the common genetic predisposing factors that were summarized in an article by Festen et al. [77].
Interestingly, the C1 cluster had the highest number of common genes across all PRS sizes (Table 4). We did not find any genes common to all three clusters, which may reflect the absence of universal mechanisms of AD development. It is also important to note that the C2 cluster (RA, SLE, and autoimmune thyroiditis (AIT), i.e., HT) contained the smallest number of common genes. Among the best-performing PRSs, the number of shared genes in clusters 1 and 3 was comparable, whereas in cluster 2 it was significantly lower. The full list of common genes by cluster and PRS size is presented in Supplementary Table S11.
To check whether cluster C2 reflected real similarity, we built heatmaps based on the Jaccard index. This analysis showed that the C1 and C3 clusters were the most stable, although at PRS size 1000, colitis had slightly higher similarity with asthma or T1D. C2 showed an association between RA and SLE, but at PRS sizes 500 and 250, RA had significantly stronger similarity with T1D (as reflected in tSNE). It is worth noting that for SLE, the Jaccard index showed the highest similarity with RA in all cases, despite RA having a higher similarity with T1D for PRS sizes below 1000. It is also of interest that AIT had good similarity with RA and less with SLE, but on the top-5 AUC heatmap, AIT, RA, and SLE were clearly clustered together (Figures S1–S4). We can therefore conclude that PRS size has a partial impact on biological associations, primarily concerning generalized autoimmune diseases, with significantly less effect on tissue-specific diseases.
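The Jaccard comparison described above can be sketched as follows, using the standard definition J(A, B) = |A ∩ B| / |A ∪ B| on the gene sets of two diseases. The gene sets are hypothetical and were chosen only to reproduce the kind of asymmetry noted in the text, where one disease's closest neighbor does not name it as its own closest neighbor; the helper names are illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(gene_sets):
    """Jaccard index for every unordered pair of diseases (keys sorted by name)."""
    return {(x, y): jaccard(gene_sets[x], gene_sets[y])
            for x, y in combinations(sorted(gene_sets), 2)}


def nearest_neighbour(disease, gene_sets):
    """Disease with the highest Jaccard similarity to the given one."""
    others = [d for d in gene_sets if d != disease]
    return max(others, key=lambda d: jaccard(gene_sets[disease], gene_sets[d]))


# Hypothetical gene sets (placeholder gene names, not results from the study).
gene_sets = {
    "RA": {"G1", "G2", "G3", "G4"},
    "SLE": {"G1", "G4", "G5"},
    "T1D": {"G1", "G2", "G3", "G6"},
}
similarities = pairwise_jaccard(gene_sets)
```

In this toy input, the nearest neighbor of SLE is RA, while the nearest neighbor of RA is T1D, which shows why a heatmap of all pairs is more informative than a single cluster label.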
As our next step, we looked for differences in pathological mechanisms between the identified clusters. For this purpose, we used the Metascape tool (https://metascape.org/gp, accessed on 26 December 2025) to describe the pathways enriched in genes identified as common. First, we annotated the genes common across all PRS sizes. This analysis showed that C1 (T1D and celiac disease) was characterized by the enrichment of pathways associated with lymphocyte differentiation and activation (e.g., MHC assembly); C2 (SLE, RA, and AIT) was associated with the regulation of the ERK1/2 cascade, which plays an essential role downstream of immune receptors in eliciting inflammatory gene expression in response to infection and cell or tissue damage; and C3 (Crohn’s disease and colitis) was enriched for response to bacterium and metal ion transport. Such heterogeneity between clusters may indicate several different mechanisms of immune system overactivation associated with ADs (Figure 2A).
To expand the dataset and test the hypothesis on a wider list of genes, we used the maximum PRS dimension (up to 1000). Pathway enrichment showed preservation and expansion of the effects we found on the smaller set of genes (Figure 2B). In C3, we found a link to the response to Gram-negative bacteria, which are indeed described as triggers of colitis and Crohn’s disease [78]. In addition, research has pointed towards mechanisms of innate immune dysfunction in the development of Crohn’s disease [79]. In C2, we found a connection with MAPK pathway regulation; MAPK dysregulation has been described as a common characteristic of these diseases [80,81,82]. C1 showed the highest number of enriched pathways, involving antigen processing, MHC presentation, intracellular activation, and cytokine synthesis. Thus, the genes from the smallest set (250) successfully demonstrated the general direction of the pathogenic process, and expanding the set to 1000 provided some details about its mechanisms.

3. Discussion

The development of autoimmune diseases involves a complex interplay of genetic predisposition, environmental triggers, and dysregulation of the immune system. In this study, we explored the potential involvement of several genes in the pathogenesis of autoimmune disorders. Our findings shed light on how variations or alterations in these genes may contribute to the susceptibility and progression of autoimmune conditions.
We speculate that while the specific SNPs associated with these genes may differ depending on the disease, the genes themselves remain common across multiple autoimmune diseases. Indeed, significant SNPs usually change gene function [83]. Thus, the malfunction of the same gene caused by various SNPs may cause various ADs. We believe that this finding could support the hypothesis of a shared genetic basis for diverse autoimmune diseases. Potentially, this knowledge can be used to develop a common autoimmune risk score for the early prediction of immune system imbalance, independent of tissue-specific genetic factors. To test this hypothesis, we investigated several autoimmune diseases and analyzed the genes and SNPs included in polygenic risk score (PRS) models for each disease.
We identified a list of common genes for the chosen ADs, which may be crucial for AD development. The most common gene is TSBP1-AS1 (testis expressed basic protein 1 antisense RNA), which was associated with 8 out of 10 ADs. The functions of TSBP1-AS1 are not yet fully understood, but this long non-coding RNA (lncRNA) exhibits high expression levels in cells of the immune system [84]. According to the GWAS Catalog (https://www.ebi.ac.uk/gwas/, accessed on 15 June 2025), it is genetically associated with numerous immune-related and dermatological diseases. Furthermore, transcripts of TSBP1-AS1 have the capability to interact with various molecules involved in gene regulation, including RNA-binding proteins (RBPs), messenger RNAs (mRNAs), and microRNAs (miRNAs) [85]. It is hypothesized that dysfunction of the TSBP1-AS1 gene may lead to alterations in the regulation of immune responses. For example, if TSBP1-AS1 expression is disrupted or if mutations in the gene alter its functionality, it could lead to improper activation or inhibition of immune cells, including T and B lymphocytes. Such changes could disrupt immune tolerance, which is a key aspect in the development of autoimmune diseases. Disruption of immune tolerance may lead to the activation of autoimmune cells directed against the body’s own tissues and organs. In addition, TSBP1-AS1 is located near the HLA locus. lncRNAs in the MHC region are increasingly recognized as regulators of nearby immune genes [86]. As an antisense transcript, TSBP1-AS1 may regulate TSBP1 expression via RNA interference, epigenetic modulation, or chromatin remodeling. These associations suggest that TSBP1-AS1 might act as an immune-related regulatory lncRNA, contributing to disease susceptibility by influencing immune-related gene expression.
Other previously mentioned genes also play a significant role in human immunology.
HLA-DQB1, HLA-DRA, HLA-DPB1, TAP2, ERAP1, BTNL2, and HCG20 play an important role in the antigen presentation process; they are either part of the MHC complex or part of the mechanism that regulates peptide transport and processing before presentation [87].
The notable representation of genes located within the MHC region among the most frequently shared loci warrants careful consideration. This region is distinguished by remarkably elevated levels of linkage disequilibrium and a substantial accumulation of immune-related genes. Consequently, it is expected to contribute disproportionately to shared genetic signals within autoimmune polygenic risk scores. The recurrent appearance of genes located in or near the MHC locus, such as TSBP1-AS1 and TNXB, is therefore likely driven, at least in part, by this regional genomic architecture.
It is important to note, however, that the shared signal is not restricted to MHC. Furthermore, an analysis of the data reveals the presence of several immune-relevant genes, including IL2RA, SH2B3, and IRF5, which are repeatedly observed across multiple scores. The presence of these non-MHC loci, in conjunction with the biologically coherent clustering of related autoimmune conditions, is consistent with a broader, genome-wide component to the shared genetic architecture. Taken together, these observations suggest that the identified common core reflects a combination of strong, regionally concentrated effects from MHC and a more distributed set of immunoregulatory pathways operating across the genome, rather than being solely attributable to linkage disequilibrium within a single genomic region.
IRF5, SH2B3, GSDMB, CSMD1, and CLEC16A are part of the autophagic regulation process and regulate inflammation [88]. IL2RA [89], BACH2 [90], and ZMIZ1 [91] support balance between immune activity and autoimmune processes. RBFOX1 [92] plays a role in neuroimmune processes.
Some genes, such as RBFOX1 [93] and PTCHD1-AS [94], are also associated with neural development and disorders. At first glance, finding these genes, which are seemingly unrelated to the immune response, among genes associated with autoimmune diseases appears paradoxical. However, there are several potential explanations for this phenomenon.
Firstly, pleiotropy is a well-established property of complex traits. Many loci contribute to multiple, seemingly unrelated, phenotypes by acting in fundamental cellular processes such as transcriptional regulation, RNA splicing, chromatin organization, or cell–cell communication. RBFOX1, although best known for its role in neuronal alternative splicing, regulates large splicing networks that are active across multiple tissues [95,96]. Perturbations in such global regulatory factors can influence immune cell differentiation, activation thresholds, or stress responses indirectly, thereby contributing to autoimmune susceptibility.
Secondly, neuroimmune and epithelial–immune crosstalk provides a plausible biological link. Autoimmune diseases frequently involve organs with dense neural innervation (gut, skin, and lung) and strong barrier functions. Genes implicated in neural development or synaptic organization may influence immune regulation via autonomic signaling, neuropeptide release, or modulation of epithelial integrity. In this context, loci initially discovered in neurodevelopmental disorders may reappear in autoimmune PRSs because they participate in shared signaling or regulatory pathways, rather than disease-specific immune mechanisms.
Thirdly, for non-coding loci such as PTCHD1-AS, the signal likely reflects regulatory effects, rather than gene-specific function. Long non-coding RNAs often act as cis-regulators of nearby genes or as components of broader chromatin domains. Variants mapped to PTCHD1-AS may tag regulatory regions that influence immune-relevant genes in a cell-type-specific manner. This is particularly relevant in PRS construction, where intronic and intergenic SNPs with modest individual effects are aggregated.
HCG20 and LOC124901301 are lncRNAs whose functions are mostly unknown, but they are located in immune-related regions of the genome and may play a role in the regulation of expression [97].
Despite the active growth of GWAS studies, the comparison and application of developed risk scores remain a challenge. This is due to various factors, such as the heterogeneity of sequencing methods, development and testing on different populations, and different performance evaluation systems. In this article, we propose evaluating the preservation of biological meaning, based on the genes included in a PRS, as a measure of the adequacy of applying a new PRS. To do this, we clustered autoimmune diseases based on the commonality of genes included in their risk scores. Our analysis showed that the biological relevance and the consistent division of diseases into logical groups are preserved, regardless of the number of genes included in the risk score. It should be noted that similar mechanisms of pathogenesis and genetic predispositions have been described for the grouped diseases. This tendency of autoimmune diseases to group into logical tissue- or pathogenetically specific clusters was described in an article by Demela et al. [17]. However, when the PRS dimensionality was reduced, we found that the average percentage of common genes in a pairwise comparison of ADs decreased. This may indicate that the broadest risk scores contain more genes that are directly related to the function of the immune system. Thus, for newly developed PRSs, we propose evaluating their applicability and biological relevance by analyzing the genes they include and comparing them with other gene sets, namely those derived from the PGS Catalog. Based on the size of the resulting PRS, we can estimate the average percentage of common autoimmune genes, assessing whether there are strong deviations towards more disease-specific or more common immune-related genes. Additionally, the commonality of autoimmune genes can be evaluated. A low number of such genes in a PRS would suggest that, for some reason, the GWAS did not identify the SNPs in these genes as significant, and that it is worth rechecking the analysis.
To evaluate the functional relevance of the identified autoimmune disease clusters, we analyzed the genes shared within each cluster across different PRS sizes and conducted a pathway enrichment analysis. Despite the minimal overlap of individual genes between clusters, genes consistently shared within a cluster formed functionally coherent groups enriched for immune-related processes. Distinct pathway signatures were observed across clusters. Notably, these enrichment patterns remained consistent across PRS sizes, suggesting that incorporating additional variants primarily serves to reinforce existing biological pathways. These findings indicate that the clustering of autoimmune diseases reflects shared underlying biological mechanisms, which are captured by a limited set of core pathways, rather than being driven by PRS size or direct gene overlap between clusters.
Currently, there is an urgent need for the early diagnosis of autoimmune diseases in order to prevent them and calculate the therapeutic window for the use of pathogenetic therapy. The analysis of polymorphisms associated with ADs and their associated genes can serve as a useful tool for identifying the mechanisms of immune system function and its disorders, as well as a method for comparing the adequacy of the PRS being developed. In addition, the discovery of common genetic mechanisms for the development of different ADs or AD groups can serve as a basis for the development of new approaches for personalized pathogenetic therapy.

4. Materials and Methods

4.1. Identification of Autoimmune Diseases (ADs) and Data Collection

Ten autoimmune diseases (ADs) were selected for investigation based on their prevalence and the availability of data in public sources. For each disease, all SNPs included in the polygenic risk scores (PRSs) listed in the PGS Catalog (https://www.pgscatalog.org/browse/scores/, accessed on 1 June 2025) were collected and stratified into models containing fewer than 1000 (PRS 1000), 500 (PRS 500), and 250 (PRS 250) variants (Figure 3). These datasets were nested: PRS 1000 included PRS 500, which in turn included PRS 250. Limiting the PRS size to no more than 1000 variants kept the datasets at a manageable number of genetic variants. To search the PGS Catalog, we used the name of the autoimmune disease as the search keyword.
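The nested size groups described above can be sketched as follows. This is a minimal illustration, not the authors' script: the score identifiers and variant counts are made up, and the strict "fewer than" comparison is an assumption, since the text uses both "fewer than 1000" and "no more than 1000".

```python
# Size limits follow the three groups named in the text.
TIERS = {"PRS 1000": 1000, "PRS 500": 500, "PRS 250": 250}

def assign_tiers(variant_counts, tiers=TIERS):
    """Map each tier name to the set of score IDs with fewer variants than its limit."""
    return {
        name: {score_id for score_id, n in variant_counts.items() if n < limit}
        for name, limit in tiers.items()
    }

# Hypothetical score IDs and variant counts, for illustration only.
counts = {"score_A": 120, "score_B": 480, "score_C": 950, "score_D": 250000}
groups = assign_tiers(counts)

# Nesting: every score in the 250 tier is also in the 500 and 1000 tiers.
is_nested = groups["PRS 250"] <= groups["PRS 500"] <= groups["PRS 1000"]
```

Because the tiers are defined by thresholds on the same count, the nesting holds by construction, and scores above the largest limit (such as `score_D` here) fall outside all three groups.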
Furthermore, we defined a group of the five best-performing PRSs for AD. Selection was based on the AUC metric available in the PGS Catalog, prioritizing scores validated within their original GWAS target population.
For each of the ten selected ADs, lists of single nucleotide polymorphisms (SNPs) included in the PGSs were compiled. Subsequently, the SNP lists were merged by trait, and the corresponding gene for each rsID was identified using databases such as dbSNP and NCBI Gene. From the PGS Catalog, we obtained a total of 209 files, each containing the list of SNPs of one selected PRS. For the sets of PRSs with up to 1000 SNPs, a total of 58 files were available. Rheumatoid arthritis had the highest number of files (17). Overall, there were 9911 unique SNPs across all files. To process the tables, we used the Python programming language (v. 3.0) along with the pandas library.
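Merging SNP lists by trait could look roughly like the sketch below. It uses only the standard library rather than pandas so that it runs anywhere, and it assumes tab-delimited scoring files with `#`-prefixed header comments and an `rsID` column; the file contents, trait names, and function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

def read_rsids(handle):
    """Yield rsIDs from a tab-delimited scoring file, skipping '#' comment lines."""
    rows = (line for line in handle if not line.startswith("#"))
    for row in csv.DictReader(rows, delimiter="\t"):
        rsid = (row.get("rsID") or "").strip()
        if rsid.startswith("rs"):
            yield rsid

def merge_by_trait(files):
    """files: iterable of (trait, file handle). Returns trait -> set of rsIDs."""
    merged = defaultdict(set)
    for trait, handle in files:
        merged[trait].update(read_rsids(handle))
    return dict(merged)

# Made-up scoring files: two for one trait, one for another.
f1 = io.StringIO("#example header\nrsID\teffect_allele\teffect_weight\nrs1\tA\t0.12\nrs2\tG\t-0.05\n")
f2 = io.StringIO("rsID\teffect_allele\teffect_weight\nrs2\tG\t0.07\nrs3\tT\t0.30\n")
f3 = io.StringIO("rsID\teffect_allele\teffect_weight\nrs3\tT\t0.10\n")

merged = merge_by_trait([("trait_X", f1), ("trait_X", f2), ("trait_Y", f3)])
unique_snps = set().union(*merged.values())
```

Using sets means an rsID that appears in several scores for the same trait is counted once, which matches the idea of counting unique SNPs across files.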

4.2. Preparation of Gene-SNP Correspondence Table for Each AD

Then, a table was compiled, indicating the SNPs (rsID) from all chosen PRSs and their weights for each position as well as the corresponding gene and disease for each chosen PRS. The corresponding gene was identified through automated annotation using the NCBI dbSNP database. Specifically, a custom Python workflow was implemented to ensure reproducibility and consistency across all PRSs. Summary statistics files in .txt format were parsed to extract the rsID column. For each variant, the script queried dbSNP via the Entrez Direct utilities (esearch, esummary, and xtract) to retrieve associated gene symbols (field: GENE_E/NAME). Gene names were deduplicated and appended to the corresponding rsID entry, and the resulting tables were consolidated into a final reference file (Table S1).
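The annotation step can be sketched as below. The exact command line used by the authors is not given, so the pipeline string is an assumption built from the three Entrez Direct tools and the `GENE_E/NAME` field named in the text; the function names are hypothetical. Running `annotate` requires Entrez Direct on the PATH and network access, so only the parsing step is exercised in the example.

```python
import re
import subprocess

def edirect_pipeline(rsid):
    """Build the shell pipeline for one rsID (assumed invocation, see lead-in)."""
    if not re.fullmatch(r"rs\d+", rsid):
        raise ValueError(f"not a valid rsID: {rsid!r}")
    return (
        f"esearch -db snp -query {rsid} | esummary | "
        "xtract -pattern DocumentSummary -element GENE_E/NAME"
    )

def parse_gene_symbols(xtract_output):
    """Split whitespace-separated xtract output and deduplicate, keeping order."""
    seen = []
    for token in xtract_output.split():
        if token not in seen:
            seen.append(token)
    return seen

def annotate(rsid):
    """Run the pipeline and return the unique gene symbols for one rsID."""
    result = subprocess.run(
        edirect_pipeline(rsid), shell=True,
        capture_output=True, text=True, check=True,
    )
    return parse_gene_symbols(result.stdout)

# Parsing a made-up xtract output with a repeated symbol.
symbols = parse_gene_symbols("GENE1\tGENE1\nGENE2\n")
```

Validating the rsID before building the command avoids passing arbitrary text to the shell. Variants in intergenic regions would return an empty list here and need separate handling.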

4.3. Analysis of Intersections and Common Genes

To assess the genetic similarities among different autoimmune diseases, a binary matrix (Gene-AD) was constructed for summarized data for each of the 3 PRS sizes (1000, 500, and 250). Presence of the variant in a gene for a certain AD was defined as 1, absence was defined as 0 (Table S2). Then, for each PRS size group and for top-performing PRSs, we calculated the number and fraction of common and individual genes for each pair in the chosen AD list (Tables S3–S5).
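A minimal sketch of the Gene-AD binary matrix and the pairwise comparison follows. The gene and disease labels are placeholders, and the text does not specify the denominator used for the "fraction of common genes"; the sketch assumes the fraction is taken over the union of the two gene sets.

```python
from itertools import combinations

def binary_matrix(genes_by_disease):
    """Return (sorted gene list, disease -> list of 0/1 flags in that gene order)."""
    genes = sorted(set().union(*genes_by_disease.values()))
    matrix = {
        disease: [1 if gene in present else 0 for gene in genes]
        for disease, present in genes_by_disease.items()
    }
    return genes, matrix

def pairwise_overlap(genes_by_disease):
    """For each disease pair: (number of shared genes, shared / union)."""
    out = {}
    for a, b in combinations(sorted(genes_by_disease), 2):
        shared = genes_by_disease[a] & genes_by_disease[b]
        union = genes_by_disease[a] | genes_by_disease[b]
        out[(a, b)] = (len(shared), len(shared) / len(union) if union else 0.0)
    return out

# Placeholder data, for illustration only.
toy = {
    "AD_1": {"G1", "G2", "G3"},
    "AD_2": {"G2", "G3", "G4"},
    "AD_3": {"G5"},
}
genes, matrix = binary_matrix(toy)
overlap = pairwise_overlap(toy)
```

Genes unique to one disease in a pair are simply the set difference of the two gene sets, so the same inputs give both the common and the individual counts.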

4.4. Identification of Genes in Which SNPs Are Associated with Multiple AD Pathogenesis

To identify genes in which SNPs are associated with the pathogenesis of multiple ADs, the frequencies of gene and SNP occurrence in the dataset were calculated. The most frequently encountered genes were selected for further analysis. First, all missense variants were analyzed for pathogenicity using ClinVar annotations [98] and RegulomeDB (https://regulomedb.org) [99]. Then, all other variants were analyzed: variants with RegulomeDB scores higher than 1f (eQTL/caQTL + TF binding/chromatin accessibility peak, per RegulomeDB), the most common score among the analyzed variants, were taken forward for further analysis, including a literature check to determine whether they could be associated with immune diseases and how they might affect protein function (Table S6). Most SNPs are not annotated in ClinVar and have no pathogenicity status, so previously published research was the most efficient way to identify reliable SNPs in the PRSs.
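The frequency count that drives the gene selection can be sketched as follows. The labels are placeholders and the cut-off of two diseases is a hypothetical threshold; the text does not state how many top genes were retained.

```python
from collections import Counter

def disease_counts(genes_by_disease):
    """Count, for every gene, the number of diseases in which it appears."""
    return Counter(g for genes in genes_by_disease.values() for g in genes)

def recurrent_genes(genes_by_disease, min_diseases=2):
    """Genes seen in at least min_diseases diseases, most frequent first."""
    counts = disease_counts(genes_by_disease)
    kept = [g for g, n in counts.items() if n >= min_diseases]
    # Sort by descending count, then by name, so ties are ordered reproducibly.
    return sorted(kept, key=lambda g: (-counts[g], g))

# Placeholder data, for illustration only.
toy = {
    "AD_1": {"G1", "G2", "G3"},
    "AD_2": {"G2", "G3", "G4"},
    "AD_3": {"G3", "G5"},
}
ranked = recurrent_genes(toy)
```

The same counting applied to rsIDs instead of gene symbols gives the SNP-level frequencies mentioned in the text.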

4.5. Data Processing and Clustering

For data analysis, we used Python version 3.0 and the following packages: pandas, numpy, scikit-learn, matplotlib, and seaborn. Dimensionality reduction and visualization of disease similarity were performed using t-distributed stochastic neighbor embedding (t-SNE). The analysis was based on a binary matrix representing the presence or absence of genes across immune-mediated diseases. t-SNE was applied to the transposed matrix, treating diseases as data points and genes as binary features.
We used Jaccard distance as the dissimilarity metric and set perplexity to 3 to account for the small number of samples (diseases). The resulting two-dimensional embeddings were visualized in Python (v. 3.0) using the Seaborn and Matplotlib libraries, with each disease labeled and color-coded to highlight potential clustering patterns.
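The clustering input can be sketched as below. The Jaccard distance is computed in plain Python; the `embed` function then passes the precomputed distances to scikit-learn's t-SNE with perplexity 3. Using a precomputed matrix with random initialization and a fixed seed is an implementation choice made for this sketch, not a detail reported in the text, and `embed` needs NumPy and scikit-learn installed.

```python
def jaccard_distance(u, v):
    """1 - |intersection| / |union| for two equal-length 0/1 vectors."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either

def distance_matrix(rows):
    """Symmetric matrix of pairwise Jaccard distances between disease rows."""
    return [[jaccard_distance(r, s) for s in rows] for r in rows]

def embed(rows, perplexity=3, seed=0):
    """Two-dimensional t-SNE embedding; rows are diseases, columns are genes."""
    import numpy as np
    from sklearn.manifold import TSNE

    dist = np.asarray(distance_matrix(rows), dtype=float)
    model = TSNE(
        n_components=2, metric="precomputed", init="random",
        perplexity=perplexity, random_state=seed,
    )
    return model.fit_transform(dist)

# Placeholder disease-by-gene rows, for illustration only.
rows = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
dist = distance_matrix(rows)
```

t-SNE requires the perplexity to be smaller than the number of samples, which is why a low value such as 3 is needed when only ten diseases are embedded. With so few points, the layout can vary noticeably between random seeds.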

5. Conclusions

The current paradigm of genetic predisposition to autoimmune diseases is gradually shifting from a primary focus on polymorphisms within immune genes and their proximal regulatory regions toward the identification of intergenic variants with potentially strong effects [100,101]. Our study shows that saturating a PRS with a large number of detected signals does not affect its biological significance and has a limited impact on the performance of the risk score. From this perspective, a more promising strategy may be to prioritize the identification of the most informative variants that are either common to the development of autoimmune mechanisms or specific to a particular autoimmune disease.
However, identifying the shared genetic features underlying ADs that cause this particular clustering may enable the development of a generalized framework for estimating the long-term risk of developing ADs. Such an approach could further contribute to predicting which clinical manifestations are more likely to arise in a given individual by linking genetic profiles to specific autoimmune clusters, thereby providing a biologically grounded basis for stratification beyond disease-specific risk scores.
While the findings suggest that biological interpretability remains consistent across PRS sizes, this consistency does not equate to universal equivalence in application. The results of the present study emphasize a critical, context-dependent distinction between the extraction of shared biological meaning and the achievement of maximal predictive utility. Compact, targeted PRSs, enriched for core pathways, are likely sufficient for mechanistic insight and population stratification. Conversely, larger PRSs, which aggregate a more comprehensive spectrum of polygenic variance, are still expected to be superior for individual-level risk prediction where accuracy is critical. Consequently, the optimal PRS design should be guided by its primary objective.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms27010543/s1.

Author Contributions

Conceptualization, D.V.T. and A.A.D.; methodology, A.A.D., D.V.T. and V.S.S.; software, V.S.S.; formal analysis, V.S.S. and N.A.B.; investigation, D.V.T. and N.A.B.; resources, V.P.B., A.V.K., D.V.S., E.S.P., V.S.Y., V.S.Y., V.S.S., M.W. and A.V.K.; data curation, V.S.S. and A.A.D.; writing—original draft preparation, N.A.B., D.V.T. and A.A.D.; writing—review and editing, A.A.M., V.P.B., J.A.K., A.A.K., E.S.P., D.V.S., O.V.G., D.I.T. and A.V.K.; visualization, D.V.T. and V.S.S.; supervision, V.P.B.; project administration, D.V.T., E.S.P. and D.V.S.; funding acquisition, V.P.B., V.S.S., S.M.Y., V.I.S., V.S.Y. and P.Y.V. All authors have read and agreed to the published version of the manuscript.

Funding

This study was sponsored by grants from ANO “Moscow Center for Innovative Healthcare Technologies” (research project no. 1309-4/25) and by the Ministry of Science and Higher Education of the Russian Federation (agreement FGFG-2025-0015).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the PGS Catalog at https://www.pgscatalog.org. These data were derived from the following resources available in the public domain: https://www.pgscatalog.org.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pisetsky, D.S. Pathogenesis of Autoimmune Disease. Nat. Rev. Nephrol. 2023, 19, 509–524.
  2. Angum, F.; Khan, T.; Kaler, J.; Siddiqui, L.; Hussain, A. The Prevalence of Autoimmune Disorders in Women: A Narrative Review. Cureus 2020, 12, e8094.
  3. Su, M.A.; Anderson, M.S. Monogenic Autoimmune Diseases: Insights into Self-Tolerance. Pediatr. Res. 2009, 65, 20R–25R.
  4. Doria, A.; Zen, M.; Bettio, S.; Gatto, M.; Bassi, N.; Nalotto, L.; Ghirardello, A.; Iaccarino, L.; Punzi, L. Autoinflammation and Autoimmunity: Bridging the Divide. Autoimmun. Rev. 2012, 12, 22–30.
  5. Zen, M.; Gatto, M.; Domeneghetti, M.; Palma, L.; Borella, E.; Iaccarino, L.; Punzi, L.; Doria, A. Clinical Guidelines and Definitions of Autoinflammatory Diseases: Contrasts and Comparisons with Autoimmunity—A Comprehensive Review. Clin. Rev. Allerg. Immunol. 2013, 45, 227–235.
  6. Wang, L.; Wang, F.; Gershwin, M.E. Human Autoimmune Diseases: A Comprehensive Update. J. Intern. Med. 2015, 278, 369–395.
  7. Toro-Domínguez, D.; Carmona-Sáez, P.; Alarcón-Riquelme, M.E. Shared Signatures between Rheumatoid Arthritis, Systemic Lupus Erythematosus and Sjögren’s Syndrome Uncovered through Gene Expression Meta-Analysis. Arthritis Res. Ther. 2014, 16, 489.
  8. Conrad, N.; Misra, S.; Verbakel, J.Y.; Verbeke, G.; Molenberghs, G.; Taylor, P.N.; Mason, J.; Sattar, N.; McMurray, J.J.V.; McInnes, I.B.; et al. Incidence, Prevalence, and Co-Occurrence of Autoimmune Disorders over Time and by Age, Sex, and Socioeconomic Status: A Population-Based Cohort Study of 22 Million Individuals in the UK. Lancet 2023, 401, 1878–1890.
  9. Popoviciu, M.S.; Kaka, N.; Sethi, Y.; Patel, N.; Chopra, H.; Cavalu, S. Type 1 Diabetes Mellitus and Autoimmune Diseases: A Critical Review of the Association and the Application of Personalized Medicine. JPM 2023, 13, 422.
  10. Gokuladhas, S.; Schierding, W.; Golovina, E.; Fadason, T.; O’Sullivan, J. Unravelling the Shared Genetic Mechanisms Underlying 18 Autoimmune Diseases Using a Systems Approach. Front. Immunol. 2021, 12, 693142.
  11. Caliskan, M.; Brown, C.D.; Maranville, J.C. A Catalog of GWAS Fine-Mapping Efforts in Autoimmune Disease. Am. J. Hum. Genet. 2021, 108, 549–563.
  12. Robino, A.; Bevilacqua, E.; Aldegheri, L.; Conti, A.; Bazzo, V.; Tornese, G.; Catamo, E. Next-Generation Sequencing Reveals Additional HLA Class I and Class II Alleles Associated with Type 1 Diabetes and Age at Onset. Front. Immunol. 2024, 15, 1427349.
  13. The Type 1 Diabetes Genetics Consortium; Barrett, J.C.; Clayton, D.G.; Concannon, P.; Akolkar, B.; Cooper, J.D.; Erlich, H.A.; Julier, C.; Morahan, G.; Nerup, J.; et al. Genome-Wide Association Study and Meta-Analysis Find That over 40 Loci Affect Risk of Type 1 Diabetes. Nat. Genet. 2009, 41, 703–707.
  14. Ma, Q.; Shams, H.; Didonna, A.; Baranzini, S.E.; Cree, B.A.C.; Hauser, S.L.; Henry, R.G.; Oksenberg, J.R. Integration of Epigenetic and Genetic Profiles Identifies Multiple Sclerosis Disease-Critical Cell Types and Genes. Commun. Biol. 2023, 6, 342.
  15. Zajec, A.; Trebušak Podkrajšek, K.; Tesovnik, T.; Šket, R.; Čugalj Kern, B.; Jenko Bizjan, B.; Šmigoc Schweiger, D.; Battelino, T.; Kovač, J. Pathogenesis of Type 1 Diabetes: Established Facts and New Insights. Genes 2022, 13, 706.
  16. Sharp, S.A.; Rich, S.S.; Wood, A.R.; Jones, S.E.; Beaumont, R.N.; Harrison, J.W.; Schneider, D.A.; Locke, J.M.; Tyrrell, J.; Weedon, M.N.; et al. Development and Standardization of an Improved Type 1 Diabetes Genetic Risk Score for Use in Newborn Screening and Incident Diagnosis. Diabetes Care 2019, 42, 200–207.
  17. Demela, P.; Pirastu, N.; Soskic, B. Cross-Disorder Genetic Analysis of Immune Diseases Reveals Distinct Gene Associations That Converge on Common Pathways. Nat. Commun. 2023, 14, 2743.
  18. Yarwood, A.; Han, B.; Raychaudhuri, S.; Bowes, J.; Lunt, M.; Pappas, D.A.; Kremer, J.; Greenberg, J.D.; Plenge, R.; Worthington, J.; et al. A Weighted Genetic Risk Score Using All Known Susceptibility Variants to Estimate Rheumatoid Arthritis Risk. Ann. Rheum. Dis. 2015, 74, 170–176.
  19. Chen, Y.-C.; Liu, T.-Y.; Lu, H.-F.; Huang, C.-M.; Liao, C.-C.; Tsai, F.-J. Multiple Polygenic Risk Scores Can Improve the Prediction of Systemic Lupus Erythematosus in Taiwan. Lupus Sci. Med. 2024, 11, e001035.
  20. Choi, S.W.; Mak, T.S.-H.; O’Reilly, P.F. Tutorial: A Guide to Performing Polygenic Risk Score Analyses. Nat. Protoc. 2020, 15, 2759–2772.
  21. Zhao, Z.; Gruenloh, T.; Yan, M.; Wu, Y.; Sun, Z.; Miao, J.; Wu, Y.; Song, J.; Lu, Q. Optimizing and Benchmarking Polygenic Risk Scores with GWAS Summary Statistics. Genome Biol. 2024, 25, 260.
  22. Marees, A.T.; de Kluiver, H.; Stringer, S.; Vorspan, F.; Curis, E.; Marie-Claire, C.; Derks, E.M. A Tutorial on Conducting Genome-Wide Association Studies: Quality Control and Statistical Analysis. Int. J. Methods Psychiatr. Res. 2018, 27, e1608.
  23. Privé, F.; Aschard, H.; Carmi, S.; Folkersen, L.; Hoggart, C.; O’Reilly, P.F.; Vilhjálmsson, B.J. Portability of 245 Polygenic Scores When Derived from the UK Biobank and Applied to 9 Ancestry Groups from the Same Cohort. Am. J. Hum. Genet. 2022, 109, 12–23.
  24. Mobasseri, M.; Shirmohammadi, M.; Amiri, T.; Vahed, N.; Hosseini Fard, H.; Ghojazadeh, M. Prevalence and Incidence of Type 1 Diabetes in the World: A Systematic Review and Meta-Analysis. Health Promot. Perspect. 2020, 10, 98–115, Correction in Health Promot. Perspect. 2020, 14, 202–205. https://doi.org/10.34172/hpp.2020.18.
  25. Feuerstein, J.D.; Cheifetz, A.S. Crohn Disease: Epidemiology, Diagnosis, and Management. Mayo Clin. Proc. 2017, 92, 1088–1103.
  26. Torres, J.; Mehandru, S.; Colombel, J.-F.; Peyrin-Biroulet, L. Crohn’s Disease. Lancet 2017, 389, 1741–1755.
  27. Ordás, I.; Eckmann, L.; Talamini, M.; Baumgart, D.C.; Sandborn, W.J. Ulcerative Colitis. Lancet 2012, 380, 1606–1619.
  28. Fortuna, G.; Brennan, M.T. Systemic Lupus Erythematosus: Epidemiology, Pathophysiology, Manifestations, and Management. Dent. Clin. North. Am. 2013, 57, 631–655.
  29. McGrogan, A.; Seaman, H.E.; Wright, J.W.; de Vries, C.S. The Incidence of Autoimmune Thyroid Disease: A Systematic Review of the Literature. Clin. Endocrinol. 2008, 69, 687–696.
  30. Hu, X.; Chen, Y.; Shen, Y.; Tian, R.; Sheng, Y.; Que, H. Global Prevalence and Epidemiological Trends of Hashimoto’s Thyroiditis in Adults: A Systematic Review and Meta-Analysis. Front. Public. Health 2022, 10, 1020709.
  31. Icen, M.; Crowson, C.S.; McEvoy, M.T.; Dann, F.J.; Gabriel, S.E.; Maradit Kremers, H. Trends in Incidence of Adult-Onset Psoriasis over Three Decades: A Population-Based Study. J. Am. Acad. Dermatol. 2009, 60, 394–401.
  32. Parisi, R.; Iskandar, I.Y.K.; Kontopantelis, E.; Augustin, M.; Griffiths, C.E.M.; Ashcroft, D.M. Global Psoriasis Atlas National, Regional, and Worldwide Epidemiology of Psoriasis: Systematic Analysis and Modelling Study. BMJ 2020, 369, m1590.
  33. GBD 2021 Rheumatoid Arthritis Collaborators. Global, Regional, and National Burden of Rheumatoid Arthritis, 1990–2020, and Projections to 2050: A Systematic Analysis of the Global Burden of Disease Study 2021. Lancet Rheumatol. 2023, 5, e594–e610.
  34. Cao, Y.; Chen, S.; Chen, X.; Zou, W.; Liu, Z.; Wu, Y.; Hu, S. Global Trends in the Incidence and Mortality of Asthma from 1990 to 2019: An Age-Period-Cohort Analysis Using the Global Burden of Disease Study 2019. Front Public Health 2022, 10, 1036674.
  35. Song, P.; Adeloye, D.; Salim, H.; Dos Santos, J.P.; Campbell, H.; Sheikh, A.; Rudan, I. Global, Regional, and National Prevalence of Asthma in 2019: A Systematic Analysis and Modelling Study. J Glob Health 2022, 12, 04052.
  36. King, J.A.; Jeong, J.; Underwood, F.E.; Quan, J.; Panaccione, N.; Windsor, J.W.; Coward, S.; deBruyn, J.; Ronksley, P.E.; Shaheen, A.-A.; et al. Incidence of Celiac Disease Is Increasing Over Time: A Systematic Review and Meta-Analysis. Am. J. Gastroenterol. 2020, 115, 507–525.
  37. Singh, P.; Arora, A.; Strand, T.A.; Leffler, D.A.; Catassi, C.; Green, P.H.; Kelly, C.P.; Ahuja, V.; Makharia, G.K. Global Prevalence of Celiac Disease: Systematic Review and Meta-Analysis. Clin. Gastroenterol. Hepatol. 2018, 16, 823–836.e2.
  38. Walton, C.; King, R.; Rechtman, L.; Kaye, W.; Leray, E.; Marrie, R.A.; Robertson, N.; La Rocca, N.; Uitdehaag, B.; van der Mei, I.; et al. Rising Prevalence of Multiple Sclerosis Worldwide: Insights from the Atlas of MS, Third Edition. Mult. Scler. 2020, 26, 1816–1821.
  39. Liu, T.; Hecker, J.; Liu, S.; Rui, X.; Boyer, N.; Wang, J.; Yu, Y.; Zhang, Y.; Mou, H.; Gomez-Escobar, L.G.; et al. The Asthma Risk Gene, GSDMB, Promotes Mitochondrial DNA-Induced ISGs Expression. J. Respir. Biol. Transl. Med. 2024, 1, 10005.
  40. Kaufman, C.S.; Butler, M.G. Mutation in TNXB Gene Causes Moderate to Severe Ehlers-Danlos Syndrome. World J. Med. Genet. 2016, 6, 17–21.
  41. Buhelt, S.; Laigaard, H.-M.; von Essen, M.R.; Ullum, H.; Oturai, A.; Sellebjerg, F.; Søndergaard, H.B. IL2RA Methylation and Gene Expression in Relation to the Multiple Sclerosis-Associated Gene Variant Rs2104286 and Soluble IL-2Rα in CD8+ T Cells. Front. Immunol. 2021, 12, 676141.
  42. Ban, T.; Kikuchi, M.; Sato, G.R.; Manabe, A.; Tagata, N.; Harita, K.; Nishiyama, A.; Nishimura, K.; Yoshimi, R.; Kirino, Y.; et al. Genetic and Chemical Inhibition of IRF5 Suppresses Pre-Existing Mouse Lupus-like Disease. Nat. Commun. 2021, 12, 4379.
  43. Martinelli-Boneschi, F.; Esposito, F.; Brambilla, P.; Lindström, E.; Lavorgna, G.; Stankovich, J.; Rodegher, M.; Capra, R.; Ghezzi, A.; Coniglio, G.; et al. A Genome-Wide Association Study in Progressive Multiple Sclerosis. Mult. Scler. 2012, 18, 1384–1394.
  44. Bahram, S.; Arnold, D.; Bresnahan, M.; Strominger, J.L.; Spies, T. Two Putative Subunits of a Peptide Pump Encoded in the Human Major Histocompatibility Complex Class II Region. Proc. Natl. Acad. Sci. USA 1991, 88, 10094–10098.
  45. Cildir, G.; Tumes, D.J. DOT1L Leaves Its Mark on Adaptive Immunity. Immunol. Cell Biol. 2021, 99, 348–350.
  46. Geranton, S.; Rostagnat-Stefanutti, A.; Bendelac, N.; Cerrato, E.; Barbalat, V.; Leissner, P.; Nicolino, M.; Thivolet, C.; Mougin, B. High-Risk Genotype for Type 1 Diabetes: A New Simple Microtiter Plate-Based ELOSA Assay. Genet. Test. 2003, 7, 7–12.
  47. O’Leary, A.; Fernàndez-Castillo, N.; Gan, G.; Yang, Y.; Yotova, A.Y.; Kranz, T.M.; Grünewald, L.; Freudenberg, F.; Antón-Galindo, E.; Cabana-Domínguez, J.; et al. Behavioural and Functional Evidence Revealing the Role of RBFOX1 Variation in Multiple Psychiatric Disorders and Traits. Mol. Psychiatry 2022, 27, 4464–4473.
  48. Kraus, D.M.; Elliott, G.S.; Chute, H.; Horan, T.; Pfenninger, K.H.; Sanford, S.D.; Foster, S.; Scully, S.; Welcher, A.A.; Holers, V.M. CSMD1 Is a Novel Multiple Domain Complement-Regulatory Protein Highly Expressed in the Central Nervous System and Epithelial Tissues. J. Immunol. 2006, 176, 4419–4430.
  49. Baum, M.L.; Wilton, D.K.; Fox, R.G.; Carey, A.; Hsu, Y.-H.H.; Hu, R.; Jäntti, H.J.; Fahey, J.B.; Muthukumar, A.K.; Salla, N.; et al. CSMD1 Regulates Brain Complement Activity and Circuit Development. Brain Behav. Immun. 2024, 119, 317–332.
  50. Arnett, H.A.; Escobar, S.S.; Gonzalez-Suarez, E.; Budelsky, A.L.; Steffen, L.A.; Boiani, N.; Zhang, M.; Siu, G.; Brewer, A.W.; Viney, J.L. BTNL2, a Butyrophilin/B7-like Molecule, Is a Negative Costimulatory Molecule Modulated in Intestinal Inflammation. J. Immunol. 2007, 178, 1523–1533.
  51. Admon, A. ERAP1 Shapes Just Part of the Immunopeptidome. Hum. Immunol. 2019, 80, 296–301.
  52. Saad, M.A.; Abdul-Sattar, A.B.; Abdelal, I.T.; Baraka, A. Shedding Light on the Role of ERAP1 in Axial Spondyloarthritis. Cureus 2023, 15, e48806.
  53. Liu, H.; Wang, S.; Cao, B.; Zhu, J.; Huang, Z.; Li, P.; Zhang, S.; Liu, X.; Yu, J.; Huang, Z.; et al. Unraveling Genetic Risk Contributions to Nonverbal Status in Autism Spectrum Disorder Probands. Behav. Brain Funct. 2025, 21, 15.
  54. Ben Khalaf, N.; Taha, S.; Bakhiet, M.; Fathallah, M.D. A Central Nervous System-Dependent Intron-Embedded Gene Encodes a Novel Murine Fyn Binding Protein. PLoS ONE 2016, 11, e0149612.
  55. Pandey, R.; Bakay, M.; Hakonarson, H. CLEC16A-An Emerging Master Regulator of Autoimmunity and Neurodegeneration. Int. J. Mol. Sci. 2023, 24, 8224.
  56. Gao, S.; Xu, T.; Liang, W.; Xun, C.; Deng, Q.; Guo, H.; Sheng, W. Association of Rs27044 and Rs30187 Polymorphisms in Endoplasmic Reticulum Aminopeptidase 1 Gene and Ankylosing Spondylitis Susceptibility: A Meta-Analysis. Int. J. Rheum. Dis. 2020, 23, 499–510.
  57. Sun, W.; Min, H.; Zhao, L. Association of BTNL2 Single Nucleotide Polymorphisms with Knee Osteoarthritis Susceptibility. Int. J. Clin. Exp. Pathol. 2019, 12, 3921–3927.
  58. Lin, Y.; Wei, J.; Fan, L.; Cheng, D. BTNL2 Gene Polymorphism and Sarcoidosis Susceptibility: A Meta-Analysis. PLoS ONE 2015, 10, e0122639.
  59. Jiang, R.; Dong, J.; Dai, Y. Genome-Wide Association Study of Rheumatoid Arthritis by a Score Test Based on Wavelet Transformation. BMC Proc. 2009, 3, S8.
  60. Zavattaro, E.; Ramezani, M.; Sadeghi, M. Endoplasmic Reticulum Aminopeptidase 1 (ERAP1) Polymorphisms and Psoriasis Susceptibility: A Systematic Review and Meta-Analysis. Gene 2020, 736, 144416.
  61. Tsui, F.W.L.; Haroon, N.; Reveille, J.D.; Rahman, P.; Chiu, B.; Tsui, H.W.; Inman, R.D. Association of an ERAP1 ERAP2 Haplotype with Familial Ankylosing Spondylitis. Ann. Rheum. Dis. 2010, 69, 733–736.
  62. Das, A.; Chandra, A.; Chakraborty, J.; Chattopadhyay, A.; Senapati, S.; Chatterjee, G.; Chatterjee, R. Associations of ERAP1 Coding Variants and Domain Specific Interaction with HLA-C∗06 in the Early Onset Psoriasis Patients of India. Hum. Immunol. 2017, 78, 724–730.
  63. Wu, X.; Zhao, Z. Associations between ERAP1 Gene Polymorphisms and Psoriasis Susceptibility: A Meta-Analysis of Case-Control Studies. Biomed. Res. Int. 2021, 2021, 5515868.
  64. Qin, Z.-M.; Liang, S.-Q.; Long, J.-X.; Deng, J.-M.; Wei, X.; Yang, M.-L.; Tang, S.-J.; Li, H.-L. Importance of GWAS Risk Loci and Clinical Data in Predicting Asthma Using Machine-Learning Approaches. Comb. Chem. High. Throughput Screen. 2024, 27, 400–407.
  65. Shamsi, B.H.; Chen, H.; Yang, X.; Liu, M.; Liu, Y. Association between Polymorphisms of the GSDMB Gene and Allergic Rhinitis Risk in the Chinese Population: A Case-Control Study. J. Asthma 2023, 60, 1751–1760.
  66. Song, G.G.; Choi, S.J.; Ji, J.D.; Lee, Y.H. Genome-Wide Pathway Analysis of a Genome-Wide Association Study on Multiple Sclerosis. Mol. Biol. Rep. 2013, 40, 2557–2564.
  67. Wang, Z.-X.; Wang, H.-F.; Tan, L.; Liu, J.; Wan, Y.; Sun, F.-R.; Tan, M.-S.; Tan, C.-C.; Jiang, T.; Tan, L.; et al. Effects of HLA-DRB1/DQB1 Genetic Variants on Neuroimaging in Healthy, Mild Cognitive Impairment, and Alzheimer’s Disease Cohorts. Mol. Neurobiol. 2017, 54, 3181–3188.
  68. Chiba, H.; Kakuta, Y.; Kinouchi, Y.; Kawai, Y.; Watanabe, K.; Nagao, M.; Naito, T.; Onodera, M.; Moroi, R.; Kuroha, M.; et al. Allele-Specific DNA Methylation of Disease Susceptibility Genes in Japanese Patients with Inflammatory Bowel Disease. PLoS ONE 2018, 13, e0194036, Correction in PLoS ONE 2019, 14, e0212148. https://doi.org/10.1371/journal.pone.0194036.
  69. Lee, Y.H.; Choi, S.J.; Ji, J.D.; Song, G.G. Genome-Wide Pathway Analysis of a Genome-Wide Association Study on Psoriasis and Behcet’s Disease. Mol. Biol. Rep. 2012, 39, 5953–5959.
  70. Matern, B.M.; Olieslagers, T.I.; Voorter, C.E.M.; Groeneweg, M.; Tilanus, M.G.J. Insights into the Polymorphism in HLA-DRA and Its Evolutionary Relationship with HLA Haplotypes. HLA 2020, 95, 117–127.
  71. Guo, C.C.; Huang, W.H.; Zhang, N.; Dong, F.; Jing, L.P.; Liu, Y.; Ye, X.G.; Xiao, D.; Ou, M.L.; Zhang, B.H.; et al. Association between IL2/IL21 and SH2B3 Polymorphisms and Risk of Celiac Disease: A Meta-Analysis. Genet. Mol. Res. 2015, 14, 13221–13235.
  72. Allenspach, E.J.; Shubin, N.J.; Cerosaletti, K.; Mikacenic, C.; Gorman, J.A.; MacQuivey, M.A.; Rosen, A.B.I.; Timms, A.E.; Wray-Dutra, M.N.; Niino, K.; et al. The Autoimmune Risk R262W Variant of the Adaptor SH2B3 Improves Survival in Sepsis. J. Immunol. 2021, 207, 2710–2719.
  73. Lee, Y.H.; Bae, S.-C.; Choi, S.J.; Ji, J.D.; Song, G.G. Genome-Wide Pathway Analysis of Genome-Wide Association Studies on Systemic Lupus Erythematosus and Rheumatoid Arthritis. Mol. Biol. Rep. 2012, 39, 10627–10635.
  74. Hagopian, W.; Lee, H.S.; Liu, E.; Rewers, M.; She, J.X.; Ziegler, A.G.; Lernmark, Å.; Toppari, J.; Rich, S.S.; Krischer, J.P.; et al. Co-occurrence of Type 1 Diabetes and Celiac Disease Autoimmunity. Pediatrics 2017, 140, e20171305.
  75. Lichtiger, A.; Fadaei, G.; Tagoe, C.E. Autoimmune Thyroid Disease and Rheumatoid Arthritis: Where the Twain Meet. Clin. Rheumatol. 2024, 43, 895–905.
  76. Sartor, R.B. Mechanisms of Disease: Pathogenesis of Crohn’s Disease and Ulcerative Colitis. Nat. Rev. Gastroenterol. Hepatol. 2006, 3, 390–407.
  77. Festen, E.A.M.; Weersma, R.K. How Will Insights from Genetics Translate to Clinical Practice in Inflammatory Bowel Disease? Best. Pract. Res. Clin. Gastroenterol. 2014, 28, 387–397.
  78. Loh, G.; Blaut, M. Role of Commensal Gut Bacteria in Inflammatory Bowel Diseases. Gut Microbes 2012, 3, 544–555.
  79. Alula, K.M.; Theiss, A.L. Autophagy in Crohn’s Disease: Converging on Dysfunctional Innate Immunity. Cells 2023, 12, 1779.
  80. Xie, J.; Sun, S.; Li, Q.; Chen, Y.; Huang, L.; Wang, D.; Wang, Y. MAPK/ERK Signaling Pathway in Rheumatoid Arthritis: Mechanisms and Therapeutic Potential. PeerJ 2025, 13, e19708.
  81. Mavropoulos, A.; Orfanidou, T.; Liaskos, C.; Smyk, D.S.; Billinis, C.; Blank, M.; Rigopoulou, E.I.; Bogdanos, D.P. P38 Mitogen-Activated Protein Kinase (P38 MAPK)-Mediated Autoimmunity: Lessons to Learn from ANCA Vasculitis and Pemphigus Vulgaris. Autoimmun. Rev. 2013, 12, 580–590.
  82. Kyritsi, E.M.; Kanaka-Gantenbein, C. Autoimmune Thyroid Disease in Specific Genetic Syndromes in Childhood and Adolescence. Front. Endocrinol. 2020, 11, 543.
  83. Antontseva, E.V.; Degtyareva, A.O.; Korbolina, E.E.; Damarov, I.S.; Merkulova, T.I. Human-Genome Single Nucleotide Polymorphisms Affecting Transcription Factor Binding and Their Role in Pathogenesis. Vestn. VOGiS 2023, 27, 662–675.
  84. Papatheodorou, I.; Moreno, P.; Manning, J.; Fuentes, A.M.-P.; George, N.; Fexova, S.; Fonseca, N.A.; Füllgrabe, A.; Green, M.; Huang, N.; et al. Expression Atlas Update: From Tissues to Single Cells. Nucleic Acids Res. 2019, 48, D77–D83.
  85. Li, J.-H.; Liu, S.; Zhou, H.; Qu, L.-H.; Yang, J.-H. starBase v2.0: Decoding miRNA-ceRNA, miRNA-ncRNA and Protein–RNA Interaction Networks from Large-Scale CLIP-Seq Data. Nucl. Acids Res. 2014, 42, D92–D97.
  86. Peltier, D.C.; Roberts, A.; Reddy, P. LNCing RNA to Immunity. Trends Immunol. 2022, 43, 478–495.
  87. Kelly, A.; Trowsdale, J. Genetics of Antigen Processing and Presentation. Immunogenetics 2019, 71, 161–170.
  88. Zhao, Y.; Forst, C.V.; Sayegh, C.E.; Wang, I.-M.; Yang, X.; Zhang, B. Molecular and Genetic Inflammation Networks in Major Human Diseases. Mol. Biosyst. 2016, 12, 2318–2341.
  89. Ohkura, N.; Sakaguchi, S. Transcriptional and Epigenetic Basis of Treg Cell Development and Function: Its Genetic Anomalies or Variations in Autoimmune Diseases. Cell Res. 2020, 30, 465–474.
  90. Thakore, P.I.; Schnell, A.; Huang, L.; Zhao, M.; Hou, Y.; Christian, E.; Zaghouani, S.; Wang, C.; Singh, V.; Singaraju, A.; et al. BACH2 Regulates Diversification of Regulatory and Proinflammatory Chromatin States in TH17 Cells. Nat. Immunol. 2024, 25, 1395–1410.
  91. Lomelí, H. ZMIZ Proteins: Partners in Transcriptional Regulation and Risk Factors for Human Disease. J. Mol. Med. 2022, 100, 973–983.
  92. Raghavan, N.S.; Dumitrescu, L.; Mormino, E.; Mahoney, E.R.; Lee, A.J.; Gao, Y.; Bilgel, M.; Goldstein, D.; Harrison, T.; Engelman, C.D.; et al. Association Between Common Variants in RBFOX1, an RNA-Binding Protein, and Brain Amyloidosis in Early and Preclinical Alzheimer Disease. JAMA Neurol. 2020, 77, 1288–1298.
  93. Bill, B.R.; Lowe, J.K.; Dybuncio, C.T.; Fogel, B.L. Orchestration of Neurodevelopmental Programs by RBFOX1: Implications for Autism Spectrum Disorder. Int. Rev. Neurobiol. 2013, 113, 251–267.
  94. Pastore, S.F.; Ko, S.Y.; Frankland, P.W.; Hamel, P.A.; Vincent, J.B. PTCHD1: Identification and Neurodevelopmental Contributions of an Autism Spectrum Disorder and Intellectual Disability Susceptibility Gene. Genes 2022, 13, 527.
  95. Umar, S.; Zhu, W.; Souza-Neto, F.; Bender, I.; Wu, S.C.; Healy, C.L.; O’Connell, T.D.; van Berlo, J.H. RBFOX1 Regulates Calcium Signaling and Enhances SERCA2 Translation. Cells 2025, 14, 664.
  96. Pedrotti, S.; Giudice, J.; Dagnino-Acosta, A.; Knoblauch, M.; Singh, R.K.; Hanna, A.; Mo, Q.; Hicks, J.; Hamilton, S.; Cooper, T.A. The RNA-Binding Protein Rbfox1 Regulates Splicing Required for Skeletal Muscle Structure and Function. Hum. Mol. Genet. 2015, 24, 2360–2374.
  97. Bergara-Muguruza, L.; Castellanos-Rubio, A.; Santin, I.; Olazagoitia-Garmendia, A. lncRNA Involvement in Immune-Related Diseases—From SNP Association to Implication in Pathogenesis and Therapeutic Potential. J. Transl. Genet. Genom. 2023, 7, 213–229.
  98. Landrum, M.J.; Lee, J.M.; Riley, G.R.; Jang, W.; Rubinstein, W.S.; Church, D.M.; Maglott, D.R. ClinVar: Public Archive of Relationships among Sequence Variation and Human Phenotype. Nucleic Acids Res. 2014, 42, D980–D985.
  99. Boyle, A.P.; Hong, E.L.; Hariharan, M.; Cheng, Y.; Schaub, M.A.; Kasowski, M.; Karczewski, K.J.; Park, J.; Hitz, B.C.; Weng, S.; et al. Annotation of Functional Variation in Personal Genomes Using RegulomeDB. Genome Res. 2012, 22, 1790–1797.
  100. Burren, O.S.; Rubio García, A.; Javierre, B.-M.; Rainbow, D.B.; Cairns, J.; Cooper, N.J.; Lambourne, J.J.; Schofield, E.; Castro Dopico, X.; Ferreira, R.C.; et al. Chromosome Contacts in Activated T Cells Identify Autoimmune Disease Candidate Genes. Genome Biol. 2017, 18, 165.
  101. Jones, S.A.; Cantsilieris, S.; Fan, H.; Cheng, Q.; Russ, B.E.; Tucker, E.J.; Harris, J.; Rudloff, I.; Nold, M.; Northcott, M.; et al. Rare Variants in Non-Coding Regulatory Regions of the Genome That Affect Gene Expression in Systemic Lupus Erythematosus. Sci. Rep. 2019, 9, 15433.
Figure 1. Clustering of autoimmune diseases based on common genes in polygenic risk scores. (A) For PRSs up to 1000 genes. (B) For PRSs up to 500 genes. (C) For PRSs up to 250 genes. (D) For PRSs with the best AUC per disease.
Figure 2. Pathway enrichment in identified clusters by common genes: (A) across all PRS sizes; (B) across PRS 1000.
Figure 3. For each AD, SNPs from PRSs were obtained from the PGS Catalog. Different SNP sets were then created: all SNPs from PRSs containing up to 250 variants were combined into one set, and the same was done for PRSs with up to 500 and up to 1000 variants. Additionally, SNPs from the five top-performing PRSs, ranked by AUC, were selected for each AD. Each SNP was mapped to its corresponding gene, and a binary matrix was constructed indicating the presence of SNPs in specific genes for each disease. Finally, t-SNE clustering was applied to each SNP set to identify common patterns among the autoimmune diseases.
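The matrix-building step described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the toy gene assignments, the function names `binary_matrix` and `jaccard`, and the use of Jaccard similarity as a quick overlap check are all assumptions made here. The t-SNE step itself is not reproduced; it would need an external library.

```python
from itertools import combinations

# Hypothetical toy input: disease -> genes hit by SNPs from its PRSs.
# These assignments are illustrative only, not the study's data.
disease_genes = {
    "T1DM": {"IL2RA", "SH2B3", "HLA-DQB1"},
    "Celiac_disease": {"SH2B3", "HLA-DQB1", "TNXB"},
    "Crohn_disease": {"GSDMB", "IRF5"},
}

def binary_matrix(disease_genes):
    """Return (genes, matrix); matrix[disease][i] is 1 if the disease's
    PRS SNPs map to genes[i], otherwise 0."""
    genes = sorted(set().union(*disease_genes.values()))
    matrix = {
        disease: [1 if gene in hits else 0 for gene in genes]
        for disease, hits in disease_genes.items()
    }
    return genes, matrix

def jaccard(row_a, row_b):
    """Jaccard similarity of two binary rows (shared genes / genes in either)."""
    both = sum(a & b for a, b in zip(row_a, row_b))
    either = sum(a | b for a, b in zip(row_a, row_b))
    return both / either if either else 0.0

genes, matrix = binary_matrix(disease_genes)
similarity = {
    (a, b): jaccard(matrix[a], matrix[b]) for a, b in combinations(matrix, 2)
}
```

Each row of `matrix` is one disease's presence/absence vector over the pooled gene list, which is the kind of input a dimensionality-reduction method such as t-SNE would take.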
Table 1. Incidence and Prevalence of Polygenic Autoimmune Diseases.
| Disease | Target | Incidence | Prevalence |
|---|---|---|---|
| T1DM | Pancreas | 15 per 100,000 people [24] | 0.12% in Europe [24] |
| Crohn’s disease | Gastrointestinal tract | 3–20 per 100,000 [25] | 0.2% to 0.3% [26] |
| Ulcerative colitis | Gastrointestinal tract | 20 per 100,000 [27] | 0.3% [27] |
| Systemic lupus erythematosus | Multiple organs | 1–8.7 per 100,000 [28] | 0.03% to 0.05% [28] |
| Hashimoto’s thyroiditis | Thyroid gland | 80 (male)–350 (female) cases per 100,000 [29] | 7.5% [30] |
| Psoriasis | Skin | 78.9 per 100,000 [31]; 30.3–321.0 per 100,000 person years [32] | 0.14–1.99% [32] |
| Rheumatoid arthritis | Synovial tissue | 208.8 cases (186.8–241.1) per 100,000 [33] | 2.45% [33] |
| Asthma | Lungs | 477.92 per 100,000 [34] | 10.0–13.2% [35] |
| Celiac disease | Small intestine | 7.8 (male)–17.4 (female) per 100,000 person years [36] | 1.4% [37] |
| Multiple sclerosis | Central nervous system | 2.1 [95% CI: 2.09, 2.12] per 100,000 person years [38] | 2.8 million (0.04%) [38] |
Table 2. Most common genes in PRSs containing up to 1000 variants across ADs.

| Gene | Diseases | Comment |
|---|---|---|
| TSBP1-AS1 | Asthma, Celiac_disease, Colitis, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | TSBP1-AS1 is a long non-coding RNA (lncRNA) that is thought to regulate the expression of proximal genes, including immune-related genes within the MHC region. Its exact function is still being explored, but it may play a role in autoimmune disease susceptibility (https://www.ncbi.nlm.nih.gov/gene/10665, accessed on 15 July 2025). |
| GSDMB | Asthma, Colitis, Crohn’s_disease, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | GSDMB (Gasdermin B) is involved in cell death processes like pyroptosis, a form of programmed cell death linked to inflammation. It has been associated with asthma, inflammatory bowel disease, and some cancers [39]. |
| TNXB | Celiac_disease, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, SLE, T1DM, Thyroid_disease | The TNXB gene encodes tenascin-XB, a member of the tenascin family of extracellular matrix glycoproteins. Mutations are linked to Ehlers–Danlos syndrome and may affect immune regulation [40]. |
| IL2RA | Asthma, Colitis, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, SLE, T1DM | IL2RA encodes the alpha chain of the interleukin-2 receptor, crucial for T cell function and immune regulation. Variants in IL2RA are linked to autoimmune diseases like type 1 diabetes and multiple sclerosis [41]. |
| SH2B3 | Asthma, Celiac_disease, Psoriasis, Rheumatoid_arthritis, SLE, T1DM, Thyroid_disease | The SH2B3 gene, also known as LNK, encodes a member of the SH2B adaptor protein family, which plays a pivotal role in hematopoiesis and acts as a key negative regulator of cytokine signaling. The protein product of SH2B3 is involved in various signaling pathways initiated by growth factors and cytokines, and its function is critical for the proper regulation of these pathways (https://www.ncbi.nlm.nih.gov/gene/10019, accessed on 15 July 2025). |
| IRF5 | Asthma, Colitis, Crohn_disease, Psoriasis, Rheumatoid_arthritis, SLE, Thyroid_disease | IRF5 (Interferon Regulatory Factor 5) is a transcription factor involved in the regulation of type I interferon and pro-inflammatory cytokines. It plays a central role in immune response and has been implicated in lupus and other autoimmune conditions [42]. |
| TSBP1 | Asthma, Celiac_disease, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | TSBP1 is a gene located within the major histocompatibility complex (MHC) region. Its precise biological function is not fully understood, but it is thought to be involved in transcriptional regulation or chromatin organization. It may also have a regulatory role in immune-related pathways due to its proximity to immune genes. Research into its function is ongoing, particularly regarding its potential link to autoimmune diseases [43]. |
| TAP2 | Asthma, Celiac_disease, Colitis, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | TAP2 encodes a membrane-associated protein that plays a crucial role in the immune system by forming a heterodimer with ABCB2, another transporter protein, to facilitate the transport of peptides from the cytoplasm into the endoplasmic reticulum (https://www.ncbi.nlm.nih.gov/gene/6891, accessed on 15 July 2025) [44]. |
| BACH2 | Asthma, Multiple_sclerosis, Rheumatoid_arthritis, SLE, T1DM, Thyroid_disease | The BACH2 gene is a critical regulator of the primary adaptive immune response, with a role in the development and function of T cells and B cells (https://www.ncbi.nlm.nih.gov/gene/60468, accessed on 15 July 2025). In the immune system, BACH2 has been linked to the regulation of gene expression through epigenetic mechanisms, such as the methylation of histone H3 at lysine 79 (H3K79me), mediated by DOT1L [45]. |
| HLA-DQB1 | Asthma, Celiac_disease, Colitis, Rheumatoid_arthritis, T1DM, Thyroid_disease | HLA-DQB1 is a key gene in the MHC class II region that helps present antigens to CD4+ T cells. Its alleles are strongly associated with autoimmune diseases, such as celiac disease and type 1 diabetes [46]. |
| HLA-DRA | Asthma, Celiac_disease, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, T1DM | HLA-DRA encodes the alpha chain of the HLA-DR antigen, which is a major histocompatibility complex (MHC) class II molecule (https://www.ncbi.nlm.nih.gov/gene/3122, accessed on 15 July 2025). |
| RBFOX1 | Asthma, Celiac_disease, Colitis, Crohn_disease, Psoriasis, T1DM | RBFOX1 is an RNA-binding protein that regulates alternative splicing in the nervous system and heart. It has been linked to neurodevelopmental disorders such as autism, epilepsy, and schizophrenia [47]. |
| PTCHD1-AS | Asthma, Celiac_disease, Colitis, Psoriasis, Rheumatoid_arthritis, T1DM | PTCHD1-AS is a non-coding RNA that may regulate the PTCHD1 gene, which is involved in neural development. Variants in this region are associated with intellectual disability and autism spectrum disorders [48]. |
| LOC124901301 | Asthma, Celiac_disease, Multiple_sclerosis, SLE, T1DM, Thyroid_disease | LOC124901301 is a predicted or uncharacterized genomic locus; its biological function is currently unknown. Further research is needed to determine its role. |
| HLA-DPB1 | Asthma, Celiac_disease, Multiple_sclerosis, Rheumatoid_arthritis, T1DM, Thyroid_disease | HLA-DPB1 is part of the MHC class II complex involved in presenting peptides to T-helper cells. It has associations with various immune responses, including transplant compatibility and autoimmune diseases. |
| CSMD1 | Celiac_disease, Colitis, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | The gene product of CSMD1 functions as a complement control protein. CSMD1 is thought to act as a tumor suppressor and is involved in complement system regulation and neural development. It has been implicated in schizophrenia and some cancers [49]. |
| HCG20 | Celiac_disease, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, T1DM, Thyroid_disease | HCG20 is a non-coding RNA gene located in the MHC region on chromosome 6; its function is not well understood. It may be involved in regulating immune gene expression or linked to disease susceptibility through genetic proximity (https://www.ncbi.nlm.nih.gov/gene/?term=HCG20, accessed on 15 July 2025). |
| BTNL2 | Celiac_disease, Colitis, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, T1DM | BTNL2 encodes a butyrophilin-like protein that modulates T-cell activation and immune response. Variants are associated with sarcoidosis and other inflammatory conditions [50]. |
| ERAP1 | Asthma, Crohn’s_disease, Psoriasis, Rheumatoid_arthritis, SLE, Thyroid_disease | ERAP1 (Endoplasmic Reticulum Aminopeptidase 1) trims peptides for presentation on MHC class I molecules. It is strongly associated with autoimmune diseases such as ankylosing spondylitis [51,52]. |
| CNTN5 | Asthma, Crohn’s_disease, Multiple_sclerosis, Rheumatoid_arthritis, T1DM, Thyroid_disease | CNTN5 encodes Contactin-5, a neural adhesion molecule important for brain development. It has been linked to autism spectrum disorders and cognitive dysfunction [53]. |
| ZMIZ1 | Celiac_disease, Multiple_sclerosis, Psoriasis, Rheumatoid_arthritis, SLE, T1DM | ZMIZ1 (zinc finger, MIZ-type containing 1) encodes a protein belonging to the PIAS (protein inhibitor of activated STAT) family, which plays a crucial role in regulating the activity of various transcription factors, including the androgen receptor, Smad3/4, and p53. The protein is also implicated in the process of sumoylation. The gene also has a connection to the immune system, as evidenced by its relationship with the novel soluble immune system factor ISRAA. ISRAA is nested within intron 6 of the mouse Zmiz1 gene and has been shown to play a role in modulating anti-infection immunity by downregulating T-cell activation [54]. |
| CLEC16A | Asthma, Multiple_sclerosis, Rheumatoid_arthritis, SLE, T1DM, Thyroid_disease | CLEC16A encodes a protein involved in autophagy and antigen presentation processes. It is a strong susceptibility gene for autoimmune diseases including type 1 diabetes and multiple sclerosis [55]. |
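The ordering of Table 2, with genes ranked by the number of diseases whose PRSs include them, can be reproduced with a few lines of code. The sketch below uses three rows copied from the table; the function names `rank_genes` and `disease_tally` are hypothetical, and the tie-breaking rule (alphabetical by gene name) is an assumption, since the source does not state one.

```python
from collections import Counter

# Three rows copied from Table 2 (gene -> diseases whose PRSs include it).
table2_rows = {
    "TSBP1-AS1": ["Asthma", "Celiac_disease", "Colitis", "Multiple_sclerosis",
                  "Psoriasis", "Rheumatoid_arthritis", "T1DM", "Thyroid_disease"],
    "IL2RA": ["Asthma", "Colitis", "Multiple_sclerosis", "Psoriasis",
              "Rheumatoid_arthritis", "SLE", "T1DM"],
    "HLA-DRA": ["Asthma", "Celiac_disease", "Multiple_sclerosis", "Psoriasis",
                "Rheumatoid_arthritis", "T1DM"],
}

def rank_genes(rows):
    """Sort genes by number of distinct diseases (descending).
    Ties are broken alphabetically by gene name (an assumption)."""
    counts = ((gene, len(set(diseases))) for gene, diseases in rows.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))

def disease_tally(rows):
    """Count how many of the listed genes appear in each disease."""
    return Counter(d for diseases in rows.values() for d in set(diseases))

ranking = rank_genes(table2_rows)
tally = disease_tally(table2_rows)
```

Here `ranking` lists the genes from most to least widely shared, and `tally` shows which diseases contribute most often to the shared core.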
Table 3. Most significant SNPs across the top shared genes according to the literature analysis and their effect on molecular pathways, as shown in the Regulome database.
| rsID | Gene | Source (PGS ID) | Type | Location | LOE (RegulomeDB) | Description | Reference |
|---|---|---|---|---|---|---|---|
| rs41521946 | BTNL2 | PGS001306 | missense | intron | 1f | rs41521946 was shown to correlate with the development of knee osteoarthritis. | [57] |
| rs2076530 | BTNL2 | PGS001309, PGS001310 | missense | intron | 1f | A study has shown that, in addition to being associated with sarcoidosis, rs2076530 may play a role in rheumatoid arthritis. | [58] |
| | | | | | | A transition constituting rs2076530 leads to the use of a cryptic splice site located 4 bp upstream of the affected wild-type donor site. Transcripts of the risk-associated allele have a premature stop in the spliced mRNA. The resulting protein lacks the C-terminal IgC domain and transmembrane helix, thereby disrupting the membrane localization of the protein, as shown in experiments using green fluorescent protein and V5 fusion proteins. | [59] |
| rs27044 | ERAP1 | PGS002293 | missense | exon | 1f | According to a meta-analysis, the rs27044 polymorphism was significantly associated with ankylosing spondylitis (AS) susceptibility in the overall population: rs27044, G versus C, OR = 1.24, 95% CI 1.16–1.33, p < 0.001. When stratified by ethnicity, rs27044 appeared to be significantly correlated with AS in both Asians and Caucasians. | [56] |
| | | | | | | This polymorphism is also known to be associated with psoriasis. | [60] |
| rs26653 | ERAP1 | PGS001345 | missense | exon | 1f | This polymorphism is known to be associated with psoriasis. | [60] |
| rs2549782 | ERAP1 | PGS001043 | missense | intron | 1f | This polymorphism is known to be associated with ankylosing spondylitis. | [61] |
| rs30187 | ERAP1 | PGS001312 | missense | exon | 1b | Significant epistatic interaction was observed between HLA-C*06 and the SNP (rs27044) located at the peptide-binding cavity of ERAP1. Evolutionary conservation analysis among mammals showed confinement of Lys528 and Gln730 within highly conserved regions of ERAP1 and suggested the possible detrimental effect of this allele in ERAP1 regulation. | [62] |
| | | | | | | A significant association was shown between rs30187 polymorphisms and psoriasis susceptibility (T vs. C, OR = 1.23, 95% CI: 1.15–1.32, p < 0.00001). | [63] |
| rs2305480 | GSDMB | PGS000037 | missense | exon | 1f | A study of GWAS risk loci showed that this polymorphism is a risk factor for asthma development. | [64] |
| rs2305479 | GSDMB | PGS004252 | missense | exon | 1f | This polymorphism was shown to be associated with allergic rhinitis in the Chinese population. | [65] |
| rs9277471 | HLA-DPB1 | PGS001301 | missense | intron | 1b | rs9277471 was shown to be associated with multiple sclerosis in GWAS pathway analysis. | [66] |
| rs1130399 | HLA-DQB1 | PGS001306 | missense | intron | 1f | rs1130399 was shown to be associated with Alzheimer’s disease. | [67] |
| rs1130368 | HLA-DQB1 | PGS001310 | missense | intron | 1f | This polymorphism was shown to be associated with inflammatory bowel disease in the Japanese population. | [68] |
| rs7192 | HLA-DRA | PGS001313 | missense | intron | 1f | The study used a psoriasis genome-wide association study (GWAS) dataset that included 436,192 SNPs in 1409 psoriasis cases and 1436 controls of European descent, and a BD GWAS dataset that contained 310,324 SNPs in 1215 BD cases and 1278 controls. Identify candidate causal SNPs and pathways (ICSNPathway) analysis was applied to the GWAS datasets, which identified 15 candidate causal SNPs and 28 candidate causal pathways. The top five candidate causal SNPs were rs1063478 (p = 1.45 × 10^−10), rs8084 (p = 2.20 × 10^−8), rs7192 (p = 5.18 × 10^−8), rs20541 (p = 5.30 × 10^−6), and rs1130838 (p = 5.65 × 10^−6), which, with the exception of rs20541 [interleukin (IL)-13], are at the human leukocyte antigen (HLA) loci. | [69] |
| | | | | | | Since this SNP (rs7192, HLA00662.1:g.4276G>T p.Val217Leu) lies within exon 4, in the region encoding the cytoplasmic tail, the resulting protein is effectively monomorphic. For this reason, in-depth studies on HLA-DRA and its function have been limited. However, analysis of sequences from the 1000 Genomes Project and preliminary data from our lab revealed an unrepresented polymorphism within HLA-DRA, suggesting a more complex role within the MHC than previously assumed. This study focused on elucidating the extent of HLA-DRA polymorphism, and extending our understanding of the gene’s role in HLA-DR~HLA-DQ haplotypes. Ninety-eight samples were sequenced for full-length HLA-DRA, and from this analysis, we identified 20 novel SNP positions in the intronic sequences within the 5711 bp region represented in IPD-IMGT/HLA. This polymorphism gives rise to at least 22 novel HLA-DRA alleles, and the patterns of intronic and 3’ UTR polymorphism correspond to HLA-DRA~HLA-DRB345~HLA-DRB1~HLA-DQB1 haplotypes. The current understanding of the organization of the genes within the HLA-DR region assumes a single lineage for the HLA-DRA gene, as opposed to multiple gene lineages, such as in HLA-DRB. | [70] |
| rs3184504 | SH2B3 | PGS001345 | missense | intron | 1f | rs3184504 was shown to be associated with celiac disease. | [71] |
| | | | | | | The rs3184504*T allele is associated with a loss-of-function amino acid change (p.R262W) in the adaptor protein SH2B3, a likely causal variant. The peritoneal infiltrating cells exhibited augmented phagocytosis in Sh2b3−/− mice with enriched recruitment of Ly6Chi inflammatory monocytes despite equivalent or reduced chemokine expression. Rapid cycling of monocytes and progenitors occurred uniquely in Sh2b3−/− mice following CLP, suggesting augmented myelopoiesis. | [72] |
| rs185819 | TNXB | PGS001296 | missense | intron | 1f | GWAS association with systemic lupus erythematosus and rheumatoid arthritis. | [73] |
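The odds ratios and 95% confidence intervals quoted in Table 3 can be checked against their reported p-value thresholds with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is common in meta-analyses but is not confirmed by the source for these particular studies. The function name `wald_from_or_ci` is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile

def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Approximate the standard error of log(OR), the z statistic, and the
    two-sided p-value from an OR and its 95% CI (assumes a Wald interval)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Values as quoted in Table 3.
se_27044, z_27044, p_27044 = wald_from_or_ci(1.24, 1.16, 1.33)  # rs27044, AS
se_30187, z_30187, p_30187 = wald_from_or_ci(1.23, 1.15, 1.32)  # rs30187, psoriasis
```

Under this assumption both z statistics come out near 6, which agrees with the reported thresholds of p < 0.001 for rs27044 and p < 0.00001 for rs30187.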
Table 4. Number of common genes across clusters in different analysis sets.
ClusterPRS 250 GenesPRS 500 GenesPRS 1000 GenesCommon in Across PRS SizesCommon in Top-5 AUC
C1—Celiac Disease, T1D21128941837
C2—RA, SLE, AIT9122386
C3—Colitis, Crohn’s disease1618401640
Common gene between clusters00000
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Shchekina, V.S.; Batashkov, N.A.; Maznina, A.A.; Krupinova, J.A.; Bogdanov, V.P.; Korobeinikova, A.V.; Tychinin, D.I.; Glushkova, O.V.; Petriaikina, E.S.; Svetlichnyy, D.V.; et al. Identifying a Common Autoimmune Gene Core as a Tool for Verifying Biological Significance and Applicability of Polygenic Risk Scores. Int. J. Mol. Sci. 2026, 27, 543. https://doi.org/10.3390/ijms27010543
