The Search of Association of HLA Class I and Class II Alleles with COVID-19 Mortality in the Russian Cohort

HLA genes play a pivotal role in the immune response by presenting pathogen peptides on the cell surface of the host organism. Here, we studied the association of HLA allele variants of class I (loci A, B, C) and class II (loci DRB1, DQB1, DPB1) genes with the outcome of COVID-19 infection. We performed high-resolution sequencing of HLA class I and class II genes in a sample of 157 patients who died from COVID-19 and 76 patients who survived despite severe symptoms. The results were further compared with HLA genotype frequencies in a control sample of 475 people from the Russian population. Although the obtained data revealed no significant differences between the samples at the locus level, they uncovered a set of notable alleles potentially contributing to the COVID-19 outcome. Our results not only confirmed the previously discovered fatal role of age and the association of the DRB1*01:01:01G and DRB1*01:02:01G alleles with severe symptoms and survival, but also allowed us to single out the DQB1*05:03:01G allele and the B*14:02:01G~C*08:02:01G haplotype, which were associated with survival. Our findings showed that not only individual alleles, but also their haplotypes, could serve as potential markers of COVID-19 outcome and be used during triage for hospital admission.


Introduction
In 2020, humankind was threatened by an epidemic that affected almost all countries to some degree. It was caused by a new virus from the Coronaviridae family, SARS-CoV-2 [1]. The first outbreak of the disease was registered in Wuhan (China), and it then rapidly spread all over the globe. By the time this article was prepared, the number of registered cases had exceeded 650 million, with over six million deaths [2]. In early 2020, the World Health Organization (WHO) named the disease caused by the SARS-CoV-2 virus "COVID-19" and declared the outbreak a pandemic.
One of the essential factors facilitating such a rapid spread of COVID-19 is the striking variability in disease manifestations, as well as in the mortality rate, among individual patients [3-5]. The accumulated data on SARS-CoV-2 suggest a correlation between the disease outcome and sex, age, and concurrent diseases [6]; the mortality rate can also be associated with individual genetic characteristics of a patient [7-9]. One of the genetic predictors of an adverse disease outcome could be the HLA class I and II genes, which encode the proteins of the human leukocyte antigen (HLA) system.
Based on the sequencing of the alleles of HLA class I and II genes and the protein composition of SARS-CoV-2, several authors analyzed the binding affinity of MHC molecules for all possible viral epitopes [15,16]. The lowest predicted level of interaction with viral antigens belonged to the protein encoded by the B*46:01 allele, while the highest belonged to the B*15:03 allele [15]. Another work showed a negative correlation between the mortality rate and the frequency of HLA-DRB1*01:01 in the Mexican population (n = 26, R = −0.44, p-value = 0.02) [17]. Research on the influence of the HLA genotype on COVID-19 severity revealed a significant difference in the allele frequency of HLA-DRB1*04:01 in severe patients as compared to the asymptomatic staff group in the European population (5.1% vs. 16.7%, p-value = 0.003 after adjustment for age and sex) [18]. In a study that included a Russian population sample, a mortality risk score was computed. The authors showed statistically significant correlations of the A*02:01 and A*03:01 alleles with a low mortality risk score and of A*01:01 with a high mortality risk score [19].
Thus, HLA alleles may yield different results in terms of statistical significance depending on the studied population [7,8,17,18,20-27]. For instance, two large (n1 = 1980, n2 = 332) GWA studies produced opposite results: one of them failed to show any association between COVID-19 and HLA genes in the European population [8], while the other revealed three alleles (HLA-A*11:01, HLA-B*51:01, HLA-C*14:02) associated with the most severe disease outcomes in a Chinese population sample [7]. This variation in results may be related both to the specific frequencies of individual HLA alleles in populations and to the frequencies of the haplotypes they form. The loci A, B, C (class I) and DRB1, DQB1, and DPB1 (class II) are the most variable among all HLA genes, which determines their different affinity for the same antigen [15]. Furthermore, the linkage disequilibrium between alleles within a locus, as well as the haplotype frequencies within a population, may have a cumulative effect on antigen presentation [13,24].
In our work, we sequenced six HLA loci (class I (loci A, B, C) and class II (loci DRB1, DQB1, DPB1)) to compare the frequencies of HLA alleles and HLA haplotypes in three groups: (1) healthy donors from a bone marrow registry, (2) patients who survived COVID-19 despite a severe course of the disease, and (3) patients who died from COVID-19, with an adjustment for age and comorbidities. This work aims to identify and validate the alleles significantly related to various COVID-19 outcomes. To this end, we performed a retrospective analysis to reveal possible associations between the identified alleles and the disease outcomes in the Russian population. First, differences in sex, age, and allele frequencies were assessed using Fisher's exact test and Pearson's chi-squared test. Then, Hardy-Weinberg equilibrium and linkage disequilibrium were evaluated for each group and each locus. After that, haplotype frequencies were estimated using the maximum likelihood method, and differences in frequencies were evaluated using Fisher's exact test and the t-test. In the final step, logistic regression models including sex, age, and an HLA locus or allele were constructed.

The Age and Sex Distribution in the Studied Samples
Females predominated in all studied groups (Figure 1). Using Pearson's chi-squared test, we detected a statistically significant deviation between the groups (Table 1).
Pairwise comparisons revealed statistically significant differences between groups 1 and 3 (Table 2). In group 2B, the proportion of males (16.67%) was significantly lower than in the other selected age groups (from 39.19% to 54.5%). We did not detect any significant differences between the other groups.
The age distribution in the groups is shown in Figure 2. Only in the second group did we observe a distribution that does not differ from normal (Shapiro-Wilk test, p = 0.72). In the other groups, there was a marked shift toward older age (group 3, Shapiro-Wilk test, p < 0.05) or younger age (group 1, Shapiro-Wilk test, p < 0.05). For this reason, the nonparametric Mann-Whitney test was chosen as the statistical test.
Statistically significant differences in age were observed between all compared groups (Table 3). The average age of deceased patients homozygous for at least one of the class I loci (A, B, C) was lower than that of patients without homozygous loci (Mann-Whitney U test; p-value < 0.05). At the same time, we did not detect any statistically significant differences in the age of patients homozygous for class I or class II loci, class II loci (in combinations), or homozygous for at least one locus.

The Distribution of Allele Frequencies, the Hardy-Weinberg Equilibrium and the Influence of the Gene Linkage Disequilibrium in the Studied Samples
The data obtained from high-resolution HLA typing for each patient included the information on both alleles of A*, B*, C*, DRB1*, DQB1*, and DPB1* loci of genes in the HLA histocompatibility complex (see HLA genotypes of all groups in Supplementary Table S1). Each allele of every gene that exhibited significant differences in the allele frequencies on the locus and allele level was analyzed in detail. The distribution of allele frequencies over six loci from three groups is shown in Supplementary Figure S1a-f.
The Hardy-Weinberg equilibrium was disrupted in group 1 for the HLA-A locus and in group 3 for the HLA-B and HLA-C loci (Table 4). Linkage disequilibrium was most prominent in group 1: all analyzed loci in this group were in statistically significant linkage with each other. In group 2, the linkage between HLA-A and HLA-DPB1 (p-value = 0.12), as well as between HLA-C and HLA-DPB1 (p-value = 0.51), was statistically insignificant. In group 3, statistically insignificant linkage was observed only between the following pairs: HLA-A and HLA-DQB1 (p-value = 0.25), HLA-A and HLA-DPB1 (p-value = 0.88), and HLA-C and HLA-DPB1 (p-value = 0.08). Based on these results, we further performed a haplotype analysis for all six loci (A, B, C, DRB1, DQB1, DPB1), for five loci (A, B, C, DRB1, DQB1), for the class I and class II loci, and for the HLA-B and HLA-C pair, which demonstrated statistically significant deviations from the Hardy-Weinberg equilibrium.
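To make the Hardy-Weinberg check above more concrete, the following minimal sketch compares observed genotype counts at one locus with the counts expected from the allele frequencies. It is a textbook chi-squared comparison and is not meant to reproduce the test that Arlequin performs; the function name and the toy genotypes are hypothetical.

```python
from collections import Counter


def hwe_chi_squared(genotypes):
    """Chi-squared statistic for deviation from Hardy-Weinberg proportions.

    `genotypes` is a list of (allele_1, allele_2) tuples for one locus.
    Illustrative only: the study itself relied on Arlequin for this test.
    """
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    # Unordered genotypes: ("A", "B") and ("B", "A") are the same genotype.
    observed = Counter(tuple(sorted(pair)) for pair in genotypes)
    alleles = sorted(freqs)
    stat = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            p = freqs[a] * freqs[b]
            # Expected count: n*p^2 for homozygotes, n*2pq for heterozygotes.
            expected = n * (p if a == b else 2 * p)
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: a locus with a complete lack of heterozygotes.
toy = [("A", "A")] * 50 + [("B", "B")] * 50
print(hwe_chi_squared(toy))
```

For highly polymorphic loci such as HLA, many expected genotype counts are small, so an asymptotic chi-squared approximation is unreliable and dedicated population-genetics software is preferable.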

The Distances upon Pairwise Comparisons
The data from the three groups were analyzed using the method of distances upon pairwise comparisons [28] with the fixation index F_ST [29] (Table 5). Zero and negative values of F_ST usually indicate the absence of genetic stratification between populations, while positive values indicate the presence of differences.
The comparison of genotype probability graphs of each sample population was consistent with the conclusions from the pairwise comparison: all group pairs exhibited similar or only slightly different behavior. We observed no statistically significant differences between the groups.
Table 5. F_ST distance between groups.
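As an illustration of what a fixation index measures, the sketch below computes a simplified, frequency-based F_ST for two groups at one locus as (H_T − H_S) / H_T, where H_S is the mean expected heterozygosity within groups and H_T is the expected heterozygosity of the pooled frequencies. This is an assumption-laden simplification and not the estimator used by Arlequin: it weights both groups equally, applies no sample-size correction, and therefore can never be negative. All names and frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for {allele: frequency}."""
    return 1.0 - sum(p * p for p in freqs.values())


def simple_fst(freqs_a, freqs_b):
    """Simplified two-group fixation index (H_T - H_S) / H_T.

    Illustrative only: equal group weights, no sample-size correction.
    """
    alleles = set(freqs_a) | set(freqs_b)
    pooled = {a: (freqs_a.get(a, 0.0) + freqs_b.get(a, 0.0)) / 2
              for a in alleles}
    h_total = expected_heterozygosity(pooled)
    h_within = (expected_heterozygosity(freqs_a)
                + expected_heterozygosity(freqs_b)) / 2
    return (h_total - h_within) / h_total if h_total > 0 else 0.0


# Hypothetical allele frequencies at one locus in two groups.
group_a = {"x": 0.8, "y": 0.2}
group_b = {"x": 0.2, "y": 0.8}
print(simple_fst(group_a, group_b))
```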

Estimation of the Allele Distribution at Locus and Allele Levels
We applied Pearson's goodness-of-fit test to each HLA locus to estimate the significance of the allele distribution. A separate analysis of the three groups showed significant effects at the locus level (Table 6), except for the DQB1 locus analyzed by the V1 method. The analysis of the combined groups (1 + 2 vs. 3: healthy donors or recovered patients vs. patients who died from COVID-19) by both methods did not reveal any notable difference at the locus level. At the next stage, we applied Pearson's goodness-of-fit test to each allele (V2) or allele combination (V1) from the three groups.
To identify significant alleles, we studied other combinations of groups using the V1 and V2 methods. We analyzed different group combinations to determine significant individual alleles or their pairs associated with either a good or a bad outcome (Tables 7 and S1). The direct comparison of groups 2 and 3 revealed no significant difference in allele frequencies, and neither did the comparison of different ages across the groups.
Finally, various combinations of groups were studied to determine significant alleles (the list of combinations in the Supplementary Table S7). Of note, the most interesting findings were represented by the difference of the DQB1*05:03 allele frequencies between groups 1 and 3.  Table S8). However, only the HLA-DQB1*05:03:01G allele showed statistical significance after the multiple comparison correction (Holm-Bonferroni).

Logistic Regression
First, we fitted a logistic regression that excluded the HLA loci and used only the patients' sex, age, and their interaction. Statistical significance was observed for the intercept (p < 0.05) and age (p < 0.05) (Table 9). To avoid the effect of linkage disequilibrium and, as a consequence, the presence of strongly correlated independent variables, each locus was analyzed independently. The logistic model including only the HLA-A locus alleles revealed statistical significance of the A*33:01:01G allele (p < 0.05), the intercept (p < 0.05), and age (p < 0.05). When we selected the best model by gradually excluding predictors based on the AIC, the allele lost its statistical significance.
The logistic model based only on the HLA-DRB1 locus alleles did not reveal any statistically significant determinants apart from age. After selecting models by gradually excluding predictors based on the AIC, statistical significance was shown for the HLA-DRB1*01:01:01G and HLA-DRB1*01:02:01G alleles (Table 12). The model including age, sex, and the HLA-DRB1*01:01:01G allele also demonstrated statistically significant results (p < 0.05) (Table 13). The model based on age, sex, and the DRB1*01:02:01G allele did not confirm its statistical significance.
The logistic model including only the HLA-DQB1 locus alleles did not reveal statistically significant determinants apart from age. After selecting models by gradual exclusion of predictors based on the AIC, statistical significance was detected for the HLA-DQB1*05:01:01G allele (Table 14). The model based on sex, age, and the HLA-DQB1*05:01:01G allele was also found to be statistically significant (Table 15).
The logistic model including the DPB1 locus alleles did not reveal any statistically significant differences apart from age, either for all alleles in this locus or for individual alleles.
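For reference, the models described in this section can be written in a standard form. This is a sketch under assumed notation, since none of the symbols below appear in the original: p_i is the probability of the modeled outcome for patient i, s_i and a_i are sex and age, and h_i is 1 if the patient carries a given allele and 0 otherwise. Which outcome is coded as 1 is not stated in the text.

```latex
% Assumed notation; the source gives no explicit formulas.
\begin{align*}
% Model without HLA terms: sex, age, and their interaction
\operatorname{logit}(p_i) = \ln\frac{p_i}{1-p_i}
  &= \beta_0 + \beta_{\mathrm{sex}}\, s_i + \beta_{\mathrm{age}}\, a_i
   + \beta_{\mathrm{sex:age}}\, s_i a_i \\
% Model with a single allele added to sex and age
\operatorname{logit}(p_i)
  &= \beta_0 + \beta_{\mathrm{sex}}\, s_i + \beta_{\mathrm{age}}\, a_i
   + \beta_{h}\, h_i \\
% Akaike information criterion: k estimated parameters,
% \hat{L} the maximized likelihood of the fitted model
\mathrm{AIC} &= 2k - 2\ln\hat{L}
\end{align*}
```

In backward stepwise selection, a predictor is dropped when removing it lowers the AIC, so the selected model balances goodness of fit against the number of estimated parameters.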

Subjects
The recovered patients with severe symptoms (group 2) and the patients who died from COVID-19 (group 3) were divided into subgroups (2A, 2B, 3A, 3B) by age (age < 65 or ≥ 65 at the moment of death/illness). Clinical features of the groups are presented in Tables 17-20, except for 50 members of group 3 for whom no data other than HLA genotypes and the outcome were available. As a control sample, we used 475 venous blood samples collected from members of the National Registry of Bone Marrow Donors at the Pirogov Medical University at the beginning of 2020.

Biomaterial Collection
The biomaterial used was venous whole blood collected into EDTA-coated tubes. The diagnostic criteria for inclusion in the study were fever and/or respiratory symptoms and a positive test for COVID-19 confirmed by the RT-qPCR test «SARS-CoV-2/SARS-CoV» (DNA Technology, Russia), which estimates viral RNA content, performed on nasopharyngeal swabs in the Moscow clinical diagnostic laboratories that collected the biomaterial. Patients with pathologies that led to greater morbidity or who had additional immunosuppression (patients with HIV, active cancer treated with chemotherapy, immunodeficiency, autoimmune diseases treated with immunosuppressants, and transplants) were not included in the study.

gDNA Isolation, Library Preparation and Sequencing
gDNA was isolated from 100 µL of venous whole blood with the Proba-McheMaks (DNA Technology LLC, Moscow, Russia) reagent kit using the automated dosing station DTstream (DNA Technology LLC, Moscow, Russia). The method involved routine steps: lysis, DNA precipitation on magnetic beads, three washing steps, and an elution step. Quality control of the isolated DNA was performed by agarose gel electrophoresis; the concentration was measured using a Qubit 3 fluorometer with the Qubit dsDNA BR Assay kit (ThermoFisher Scientific, Grand Island, NY, USA) (mean concentration 31.03 ng/µL, standard deviation 50.28 ng/µL, median 15.2 ng/µL, range 1.01-200 ng/µL).
The preparation of amplicon libraries for HLA high-resolution genotyping was performed using the HLA Expert kit (DNA Technology LLC, Moscow, Russia) following the manufacturer's protocol (the kit is certified by the Russian Federal Service for Surveillance in Healthcare (Roszdravnadzor)). It included several stages. The first stage involved a qPCR for a human gene that has no pseudogenes and is present in a single copy. This was required to estimate the concentration of a genomic DNA sample and the presence of inhibitors in it. The results were used for normalization of the DNA amount during the following stage. The second stage involved a multiplex PCR for the most variable exons (2, 3, 4 for HLA class I and 2, 3 for HLA class II). Primers were designed using conserved regions of the gene introns flanking the exons. Several primers with a one-nucleotide shift were used to prevent an imbalance in nucleotide content during sequencing. The third stage involved ligation of the adapters containing Illumina i5 and i7 indexes. The fourth stage was an additional routine PCR (6 cycles) with the p5 and p7 primers. Purification with magnetic beads (SPRI type) was performed after each stage. Quality control of the libraries was performed using agarose gel electrophoresis; the concentration was measured using the Qubit 3 fluorometer with the Qubit dsDNA HS Assay kit (ThermoFisher Scientific, USA).
Sequencing was performed using the Illumina MiSeq platform (Illumina, San Diego, CA, USA) with the MiSeq Reagent Kit v3 (600-cycle), according to the manufacturer's protocol.
Fastq files were analyzed with HLA-Expert software (DNA Technology LLC, Moscow, Russia) following the manufacturer's instructions. Obtained exon sequences were aligned to the human major histocompatibility complex (MHC) sequences IMGT/HLA v3.41.0 [30].
Basic quality control (QC) metrics included:
• Quality threshold for reads (low-quality reads were trimmed or discarded);
• Lowest absolute and relative coverage for each position;
• The highest number of differences (insertions, substitutions, deletions) from the group average for each read;
• Maximum relative position error: the number of differences (insertions, substitutions, deletions) from the consensus sequence in each position should not exceed the specified threshold;
• The highest average error per read for a group;
• The lowest number of reads in groups for each exon (exons 2, 3, 4 for class I; exons 2, 3 for class II);
• Allelic imbalance, which should not exceed a given threshold (the ratio of the read number for the exons from each allele and the sum of these ratios);
• The presence of phantom (cross-mapping) and chimeric sequences;
• The percentage of combined, clustered, and used for typing reads computed for each sample.

Statistical Analysis
Allele frequencies in the analyzed cohorts were estimated by dividing the number of occurrences of a given allele in a cohort by twice the total number of individuals (alleles of homozygous individuals were counted as two occurrences). Statistical analysis included Pearson's goodness-of-fit test (for the distribution of alleles in each HLA locus, allele, and allele combination, and for sex ratios in groups), Fisher's exact test for determining the significance of differences between allele frequencies, and the Wilcoxon rank sum test with continuity correction for estimating the differences in age between all groups. Arlequin (version 3.5.2.2) was used to conduct the population assignment test, estimate the Hardy-Weinberg equilibrium and pairwise linkage disequilibrium, and measure the distances upon pairwise comparisons between all three groups [31]. We created several scripts to estimate the diversity of each gene and the differences in the frequencies of individual alleles. Another script corrected the input table with patients' data and transformed the names of HLA alleles to a unified syntax (https://github.com/genomecenter/HLA_article; accessed on 22 June 2021). A further script generated an input file containing patients' data for Arlequin.
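The allele-frequency definition and the use of Fisher's exact test described above can be sketched as follows. This is an illustration written for this text, not the authors' scripts from the repository; the function names and the toy genotypes are hypothetical, and the two-sided p-value follows the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from collections import Counter
from math import comb


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one locus.

    `genotypes` is a list of (allele_1, allele_2) tuples, one per individual.
    Every allele copy is counted, so a homozygote contributes two
    occurrences, and the denominator is twice the number of individuals.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * len(genotypes)
    return {allele: n / total for allele, n in counts.items()}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical toy cohort of four individuals typed at HLA-A.
cohort = [("A*01:01", "A*01:01"), ("A*01:01", "A*02:01"),
          ("A*02:01", "A*03:01"), ("A*03:01", "A*24:02")]
print(allele_frequencies(cohort))
# Rows: two groups; columns: copies of one allele vs. all other copies.
print(fisher_exact_two_sided(3, 1, 1, 3))
```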
To determine the significant alleles for each gene from a contingency matrix, we used the Holm-Bonferroni method [32] with a significance level of 0.05 for multiple comparison correction. For that purpose, we designed a special script. Haplotype frequencies were estimated by Arlequin (version 3.5.2.2) using the expectation-maximization (EM) algorithm. Haplotype frequencies were determined for the class I loci, class II loci, 5 loci (A, B, C, DRB1, DQB1), and 6 loci (A, B, C, DRB1, DQB1, DPB1). The standard deviation was assessed by bootstrapping (n = 1000). The differences in mean haplotype frequencies between samples were compared with the t-test using the mean haplotype frequencies and standard deviations (the number of identified haplotypes in the samples was used as the degrees of freedom). After that, we checked the results with Fisher's exact test.
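The Holm-Bonferroni correction mentioned above can be illustrated with the textbook step-down procedure. The sketch below is not the authors' script; the function name and the p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    The p-values are visited in ascending order; the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first p-value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical per-allele p-values for one locus.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

Because the threshold relaxes at each step, this procedure is never less powerful than the plain Bonferroni correction while still controlling the family-wise error rate.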
We employed Pearson's goodness-of-fit test to study each gene separately and analyzed alleles at the 2-field level for each gene. The null hypothesis stated that HLA did not affect the divergence of allele distribution and allele frequencies in the groups. To evaluate the role of single alleles and allele combinations, the contingency matrix for each gene was compiled using one of two methods (Table 21). The first method (V1) employed allele combinations (both alleles in a pair) to produce a contingency table, which allows for estimating the significance of the impact of the allele combination present in each locus on the disease outcome. The second method (V2), which more closely approximates biological processes in organisms, counted the carriers of each allele in each group.
Table 21. An example of using the two methods for collecting patients' data to produce a contingency matrix of the A gene.
[Table 21 columns: selection principle; example of the data; degrees of freedom. First row: the joint analysis of an allele pair (V1).]
We used logistic regression to separate the impacts of age and sex from the influence of alleles. We employed two models. The first model included only sex and age as independent variables and the outcome as the dependent variable. The second one used sex, age, and the presence of a certain allele as independent variables and the outcome as the dependent variable. The analysis was conducted with glm and step from the R stats package (ver 4.2.0).
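The two ways of compiling a contingency matrix described in this section (V1 and V2) can be sketched as follows. This reflects one reading of the text, since the worked example from Table 21 is not available here: V1 counts unordered allele pairs per group, and V2 counts each individual once for every distinct allele carried. The function names, group labels, and genotypes are hypothetical, and the last function computes only the Pearson chi-squared statistic and degrees of freedom, not a p-value.

```python
from collections import Counter


def contingency_v1(groups):
    """V1 (assumed reading): count unordered allele pairs per group."""
    return {name: Counter(tuple(sorted(pair)) for pair in genotypes)
            for name, genotypes in groups.items()}


def contingency_v2(groups):
    """V2 (assumed reading): count carriers of each allele per group.

    A homozygous individual is counted once for the allele it carries.
    """
    return {name: Counter(a for pair in genotypes for a in set(pair))
            for name, genotypes in groups.items()}


def pearson_chi_squared(table):
    """Chi-squared statistic and degrees of freedom for {row: Counter}.

    Assumes every group contains at least one observation.
    """
    columns = sorted({col for row in table.values() for col in row})
    row_totals = {name: sum(row.values()) for name, row in table.items()}
    col_totals = {col: sum(row[col] for row in table.values())
                  for col in columns}
    grand = sum(row_totals.values())
    stat = 0.0
    for name, row in table.items():
        for col in columns:
            expected = row_totals[name] * col_totals[col] / grand
            stat += (row[col] - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(columns) - 1)
    return stat, dof


# Hypothetical toy data for the A gene in two outcome groups.
groups = {
    "survived": [("A*01:01", "A*02:01"), ("A*02:01", "A*02:01")],
    "deceased": [("A*01:01", "A*01:01"), ("A*02:01", "A*01:01")],
}
print(contingency_v1(groups))
print(contingency_v2(groups))
print(pearson_chi_squared(contingency_v2(groups)))
```

With real HLA data, most cells of such a table are sparse, which is one reason the per-allele comparisons in the text are additionally checked with Fisher's exact test.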

Discussion
Since the HLA genotype determines an individual's repertoire of immune responses to foreign pathogens, it could contribute to COVID-19 susceptibility and severity. In particular, it is important to analyze certain alleles and haplotype frequencies in detail across different populations.
Several studies (see Table 5) suggested a number of alleles that might be statistically significant predictors of the COVID-19 outcome. Different methods of computing significance produced population-specific results. Analysis of HLA protein affinity [15,16] showed that the B*46:01 and C*12:03 or C*14:02 and A*02:01 alleles have a statistically significant association with COVID-19. At the same time, comparing the mortality level and the diversity of HLA alleles [17,20,27] suggested the existence of other associated alleles (DRB1*01:01, C*05, A*02:01).
In our work, we scrutinized three groups comprising representatives of the Russian population. We used various approaches for estimating the link of individual HLA alleles and the haplotypes they form with the COVID-19 outcome. We found a significant bias in the sex and age composition of the groups. Group 1 mostly consisted of females (71.2%), whereas group 3 included mostly males (56.1%). This may be explained by the statistically significant association of sex with the survival rate after severe disease, as was shown previously [1]. In contrast, group 2B included fewer males (16.67%). Age is a significant factor, which was confirmed in our work as well as in earlier works [1]. Estimating the Hardy-Weinberg equilibrium showed a disruption of the equilibrium in the HLA-B and HLA-C loci, which, together with the linkage disequilibrium, drew our attention to these loci. The list of all alleles that were statistically significantly associated with the disease outcomes is shown in the Table. Employing the first method of creating contingency tables (V1), we identified statistically significant differences in the numbers of patients homozygous for the HLA-A*01:01:01G allele between groups 3A and 3B (p < 0.05) and between group 1 and group 3A (p < 0.05), which is in line with the previous results [19]. However, after the multiple comparison correction, the differences became statistically insignificant, and they were not detected upon comparison of groups 2 and 3. Estimating the mean age in group 3 showed that, on average, patients homozygous for class I loci died at a younger age, which is also consistent with the previous data [19]. Meanwhile, neither group 2 nor the loci of the other class displayed a similar association, implying a greater contribution of the class I loci to the disease severity.
Our findings reveal that, due to linkage disequilibrium, the statistically significant alleles were combined into individual haplotypes that could predominate among deceased patients (B*27:02:01G~C*02:02:02G) or survivors (B*14:02:01G~C*08:02:01G). The obtained results are only partially consistent with the previous findings (DRB1*01:01, DRB1*01:02), which can be related to the population characteristics, as well as to the insufficient sample size, which greatly decreased the statistical power of the analysis. The power of the analysis might be reduced further if, in the population, the alleles with high affinity toward viral peptides (C*08:02:01G, A*02:01:01G, C*02:02:02G) are co-inherited with the alleles showing lower affinity (B*14:02:01G, B*27:02:01G) [15]. Except for DRB1*01:01:01G and DRB1*01:02:01G, none of the alleles or haplotypes that were statistically significant in our study were identified in other works (Table 22). This may be due to the specificity of the population frequencies of the alleles and haplotypes, which directly affects the power and applicability of statistical methods.
The main limitation of this work is the lack of detailed clinical data for patients, which does not allow us to study the relationship between HLA genotypes or haplotypes and additional factors affecting survival, such as comorbidities, BMI, etc. Another important limitation is the relatively small sample size, which could significantly reduce the power of the assessment methods used, especially for rare alleles and haplotypes.

Conclusions
In the present study, we analyzed HLA genotypes of three Russian population samples: healthy individuals, patients who survived severe COVID-19, and patients who died from it. Using Fisher's exact test and Pearson's goodness-of-fit test, we performed haplotype frequency analysis and logistic regression to assess how the alleles of loci A, B, C, DRB1, DQB1, and DPB1 influenced the COVID-19 outcome. The immediate results showed the absence of any significant difference between the groups at the locus level; however, several alleles proved to be promising. These include the already known DRB1*01:01 and the DQB1*05:03 allele detected in the current research, since they presumably influence the outcome of COVID-19. We also found a decrease in the frequency of one of the common haplotypes (B*14:02:01G~HLA-C*08:02:01G) in the group of deceased patients. In contrast, the frequency of this haplotype in the group of survivors was three times its frequency in the control group. Still, the results allow us to conclude that the associations of HLA alleles with COVID-19 progression and outcome depend largely on individual characteristics of the population under investigation. In further work, we plan to collect larger samples and more detailed information on comorbidities, which will provide higher statistical power and allow a more accurate assessment of the role of HLA genes and their haplotypes in the course of the disease.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms24043068/s1, Table S1: HLA-genotype for all samples; Table S2a: Estimated frequency for six HLA loci; Table S2b: Estimated frequency for five HLA loci; Table S2c: Estimated frequency for I class HLA loci; Table S2d: Estimated frequency for II class HLA loci; Table S2e: Estimated frequency for HLA-B and HLA-C loci;

Informed Consent Statement:
The informed consent for participation in the study was obtained from every patient or their closest relatives.
Data Availability Statement: The data described in this article are openly available in the Supplementary material.