Unique Polymorphisms at BCL11A, HBS1L-MYB and HBB Loci Associated with HbF in Kuwaiti Patients with Sickle Cell Disease

Patients with sickle cell disease (SCD) in Kuwait have elevated HbF levels ranging from ~10–44%; however, the modulating factors are unclear. We investigated the association of single nucleotide polymorphisms (SNPs) at BCL11A, HBS1L-MYB and HBB with HbF levels in 237 Kuwaiti SCD patients, divided into 3 subgroups according to their HbF levels. An Illumina Ampliseq custom DNA panel was used for genotyping, and results were confirmed by arrayed primer extension or Sanger sequencing. In the BCL11A locus, the CC genotype of rs7606173 [χ2 = 16.5] and the GG genotype of rs10195871 [χ2 = 15.0] were associated with the HbF-1 and HbF-2 subgroups, unlike rs1427407-T [χ2 = 17.3], which showed the highest association across the three subgroups. The HBS1L-MYB locus revealed 2 previously-described SNPs (rs66650371 [χ2 = 9.5] and rs35959442 [χ2 = 9.2]) and 2 previously-unreported SNPs (rs13220662 [χ2 = 6.2] and rs1406811 [χ2 = 6.7]) that were associated with the HbF-3 subgroup, making this the key locus elevating HbF to the highest levels. HBB cluster variants were associated with lower levels of HbF (β = −1.1). We report four previously-unpublished variants showing significant association with HbF. Each of the three quantitative trait loci affects HbF levels differently; unique SNPs, especially in HBS1L-MYB, elevate HbF to the highest levels.


Introduction
Sickle cell disease (SCD) is the most widespread monogenic disease worldwide with significant morbidity and mortality, associated with the HBB mutation, rs334 (Glu6→Val; GAG→GTG). Homozygotes for this mutation have HbSS or sickle cell anemia (SCA), which is the most severe form of the disease. Compound heterozygotes such as HbSC or HbSβ-thal usually have less severe, but variable phenotypes. Irrespective of the Hb genotype, however, SCD shows remarkable clinical heterogeneity. Several genetic and environmental factors modulate the disease phenotype, of which the fetal hemoglobin (HbF) level is the most potent.
The eventual level of HbF (α2γ2) in patients with SCD is influenced by cis- and trans-acting single nucleotide polymorphisms (SNPs) in known quantitative trait loci (QTLs) on chromosomes 11 (HBB), 2 (BCL11A) and 6 (HBS1L-MYB), respectively. Several SNPs have been described in each of these loci in SCD and thalassemia patients in different parts of the world [1][2][3]. The important cis-acting SNPs on the HBB locus are also related to the βS gene cluster haplotype. Thus, patients with the Arab/Indian (AI) and Senegal (SEN) haplotypes have the highest HbF levels and mildest phenotypes. These are also the haplotypes associated with the HBG2, -158 (C→T) XmnI SNP, namely rs7482144.
Kuwait is a small country in the Northeast corner of the Arabian Peninsula; yet it has a very heterogeneous population. The early settlers migrated mainly from Eastern Saudi Arabia in the early 18th century, but also included people from Iran, Iraq, North and East Africa, the Mediterranean and Southern Asia. In spite of this, patients with SCD in Kuwait predominantly carry the AI haplotype, with elevated HbF levels, although there is considerable variability [4][5][6], with normally-distributed values ranging from ~10–40%. This suggests that there are multiple factors driving HbF expression in this group of patients. Previous investigations among patients in Kuwait and those from Eastern Saudi Arabia, carrying similar βS haplotypes, have failed to show a significant effect of modifier SNPs that were reported for patients from other populations [7][8][9]. It was therefore postulated that novel HbF-associated SNPs exist among Gulf Arabs, especially among patients with the highest HbF levels [8,10].
We have genotyped multiple SNPs at the BCL11A, HBS1L-MYB, HBB and Xp22 loci, to investigate whether unexplored SNPs are associated with high HbF levels in the Kuwaiti SCD population. The SNPs were selected based on association signals detected in a pilot study (unpublished data) and on other common published variants. We categorized SCD patients according to their HbF levels into 3 subgroups (HbF-1, HbF-2 and HbF-3) and explored the association of the variants in the 3 QTLs with each subgroup. We hereby report novel variants at these loci, which have varying impacts on HbF expression across the 3 subgroups, while some are unique to the group with the highest HbF levels. We take full cognizance of the need for large numbers of patients for this type of genomic study; the data presented should therefore be seen as preliminary, while further studies are underway.

Materials and Methods
The study participants were consenting patients with SCD being followed in the hematological clinics of Mubarak Al-Kabeer and Amiri Hospitals in Kuwait. The study was approved by the Human Research Ethics Committees of the Faculty of Medicine and the Kuwait Ministry of Health. The patients gave written consent or assent as appropriate.
Blood samples were drawn by venipuncture when the patients were in steady-state, i.e., without acute illness or crisis or having received blood transfusion in the 6 weeks preceding the study. Complete blood count (CBC) was obtained using an ABX Pentra 120 cell counter (ABX France, Montpellier), while Hb quantitation was achieved with cation-exchange high-performance liquid chromatography (HPLC) (Shimadzu LC-20AT, Shimadzu Corporation, Kyoto, Japan). Pre-treatment CBC and HbF values were used for data analysis among patients on hydroxyurea. The patients were divided into 3 subgroups based on their HbF levels: HbF-1 <20%, HbF-2 between 20 and 30% and HbF-3 with levels >30%.
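The subgroup thresholds described above can be sketched as a small function. This is an illustrative sketch only, not the authors' code: the function name `hbf_subgroup` is hypothetical, and assigning values of exactly 20% and 30% to HbF-2 is an assumption based on the wording "between 20 and 30%".

```python
def hbf_subgroup(hbf_percent: float) -> str:
    """Assign a patient to HbF-1, HbF-2 or HbF-3 from the HbF percentage.

    Thresholds follow the text: HbF-1 < 20%, HbF-2 20-30%, HbF-3 > 30%.
    Treating the boundary values 20 and 30 as HbF-2 is an assumption.
    """
    if not 0 <= hbf_percent <= 100:
        raise ValueError("HbF must be a percentage between 0 and 100")
    if hbf_percent < 20:
        return "HbF-1"
    if hbf_percent <= 30:
        return "HbF-2"
    return "HbF-3"


if __name__ == "__main__":
    # Subgroup means reported in the General Characteristics section.
    for value in (16.8, 23.4, 34.3):
        print(value, hbf_subgroup(value))
```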

Genotype Determination
Genomic DNA was extracted from peripheral leukocytes using the phenol-chloroform method. The Illumina Ampliseq custom DNA panel was used to genotype the DNA samples. For the HBB locus, all β-globin mutations and variants were confirmed by arrayed primer extension (APEX) or Sanger sequencing methods. The -158 (G→A) XmnI polymorphism in the HBG2 gene promoter (rs7482144) was confirmed by digestion of an amplified fragment. Genomic variations of NGS data were analyzed with Alamut Visual v.2.11 and v.2.13 and Illumina's Variant Interpreter software. APEX and Sanger sequencing results were analyzed using Genorama PicDb Autoscan 7.0 software (Asper Biotech) and GenomeLab GeXP Genetic Analysis System v. 11.0 (Beckman Coulter), respectively. βS HBB haplotypes were determined using a modification of the phase SNP method described by Shaikho et al. [11]. A detailed review of the full haplotype descriptions and analysis results with the newly described significant SNPs, as well as the β-thalassemia mutations detected in this study, will be reported separately.

Statistical Analysis
Descriptive analyses were performed with IBM SPSS software, version 25 (IBM, New York, NY, USA). The criterion for statistical significance was p < 0.05. HbF measurements obtained from patients younger than 5 years old were excluded from statistical analysis because HbF may not have stabilized before this age. All genetic analyses, quality control (QC) measures, SNP association tests and linkage disequilibrium calculations were performed using the PLINK software package version 1.9 (https://www.cog-genomics.org/plink2, accessed on 28 February 2020). First, SNPs with statistically significant (p < 0.05) deviation from the Hardy-Weinberg Equilibrium (HWE) were excluded from downstream analysis. Additionally, monomorphic (non-variable) SNPs or SNPs with a minor allele frequency (MAF) <5% (threshold) were excluded from further analysis. In order to identify SNPs that present non-redundant information about the genomic structure, tagging SNPs (tagSNPs) were selected on a block-by-block basis. The differences in HbF % and genotypes of the tentative SNPs were evaluated by one-way analysis of variance (ANOVA). The Haploview software package 4.2 (http://www.broad.mit.edu/haploview, accessed on 7 October 2019) was used to determine pairwise linkage disequilibrium across the genomic regions under study.
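The two SNP-level QC filters described above (HWE deviation at p < 0.05 and MAF < 5%) can be illustrated from genotype counts alone. The sketch below is a simplified stand-in, not PLINK's implementation: it uses a 1-degree-of-freedom Pearson chi-square test for HWE, which may differ from the test PLINK applies, and the function names `maf`, `hwe_chi2` and `passes_qc` are hypothetical.

```python
import math


def maf(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from counts of AA, AB and BB genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Pearson chi-square test (1 df) for deviation from Hardy-Weinberg.

    Assumption: a chi-square approximation is used here for clarity only.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def passes_qc(n_aa: int, n_ab: int, n_bb: int,
              maf_min: float = 0.05, hwe_alpha: float = 0.05) -> bool:
    """Keep a SNP only if MAF >= maf_min and the HWE p value >= hwe_alpha."""
    _, p_value = hwe_chi2(n_aa, n_ab, n_bb)
    return maf(n_aa, n_ab, n_bb) >= maf_min and p_value >= hwe_alpha


if __name__ == "__main__":
    # Made-up genotype counts, for illustration only.
    for counts in [(25, 50, 25), (50, 0, 50), (98, 2, 0)]:
        print(counts, passes_qc(*counts))
```

The default thresholds mirror the values stated in the text; a monomorphic SNP has a MAF of 0 and is therefore removed by the MAF filter.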

General Characteristics
The study group consisted of 237 patients with SCD, made up of 65.8% HbSS and 34.2% HbSβ0 thalassemia. We included 112 healthy individuals in order to perform pre-statistical analysis among patients and healthy individuals and to generate haplotype patterns for the Kuwaiti population. Pre-statistical analysis revealed an overestimation of SNPs associated with high HbF levels among the patients; thus, the inclusion of healthy individuals failed to validate true association, probably because of the large, significant difference (p ≤ 0.0001) in HbF levels between patients (mean HbF = 23.3 ± 9.6%) and healthy individuals (mean HbF = 1.9 ± 2.3%). Therefore, the latter were excluded from genetic association analysis.
The mean age of the patients was 12.8 years, with 44% females. There was no significant difference in the mean HbF values between the SS (22.7 ± 8.5%) and Sβ0-thal (24.1 ± 11.4%) groups. The mean HbF levels in the HbF-1, HbF-2 and HbF-3 subgroups were 16.8 ± 2.1, 23.4 ± 3.4 and 34.3 ± 5.0%, respectively. The distribution of HbF levels among the patients is shown in Figure S1. Our results showed that HbF levels did not differ (p > 0.05) between the age groups. In the present analysis, HbF levels in female and male patients did not differ significantly (p > 0.05), in contrast to previously published reports in the literature.

BCL11A Locus
We genotyped 28 SNPs, 13 of which showed significant association in the three HbF subgroups. Association results for all variants are presented in Supplementary Material Table S1. Our major finding in this locus was an intronic variant, rs1427407, in the DNase I hypersensitive site (DHS) +62, which was in strong linkage disequilibrium (LD) (r2 ≥ 0.9) with three SNPs: rs766432, rs4671393 and rs1896296; hence they are most likely tagging the same genetic signal. Subjects carrying the TT genotype of rs1427407 had the strongest association with the HbF-2 subgroup (χ2 = 17.3, β = 1.7) (Table 1) and showed the highest mean HbF value, as shown in Figure 1. The tagged SNPs showed similar significant trends in the HbF-2 subgroup (Table S1). The second most significant association was with rs7606173 (χ2 = 16.5 and p = 6.1 × 10⁻⁵) in DHS +55, which was in moderate LD (r2 = 0.5) with rs6709302. The third strongest was with rs10195871 (χ2 = 15 and p = 7.95 × 10⁻⁵), which was in strong LD (r2 = 0.9) with rs10172646 and in moderate LD (r2 = 0.6) with rs11886868. All five SNPs mentioned above showed relatively low significance in the HbF-3 subgroup. Subjects homozygous for the minor alleles of rs7606173 (CC) and rs10195871 (GG) had relatively lower HbF levels compared to other genotypes. These findings were confirmed by the allelic change of the beta coefficient (β = −1.3 and β = −1.2, respectively) (Table 1).
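For readers unfamiliar with the r2 measure of LD used above, the following sketch computes it for two biallelic SNPs. It assumes phased haplotypes are already available, which is a simplification: software such as Haploview estimates haplotype frequencies from unphased genotypes. The function name `ld_r2` and the example alleles are hypothetical.

```python
def ld_r2(haplotypes: list[tuple[str, str]]) -> float:
    """Pairwise LD (r squared) between two biallelic SNPs.

    `haplotypes` holds one (allele at SNP 1, allele at SNP 2) pair per
    chromosome. Assumption: phase is known, which is rarely true for raw
    genotype data.
    """
    n = len(haplotypes)
    ref_a, ref_b = haplotypes[0]  # arbitrary reference allele at each SNP
    p_a = sum(h[0] == ref_a for h in haplotypes) / n
    p_b = sum(h[1] == ref_b for h in haplotypes) / n
    p_ab = sum(h[0] == ref_a and h[1] == ref_b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Made-up haplotypes: the two SNPs always travel together.
    linked = [("T", "A")] * 3 + [("G", "G")] * 7
    print(ld_r2(linked))
```

Values near 1 mean that one SNP almost fully predicts the other, which is why strongly linked markers are described in the text as tagging the same signal.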
Stepwise regression analysis revealed that two SNPs, rs1427407 and rs10195871, are independently associated with HbF levels. These two variants are located in BCL11A intron 2, and are in weak LD (0.38) with each other. To further understand the joint effect of the combinations of rs1427407, rs10195871 and rs7606173 on the BCL11A HbF association signal, we performed a haplotype analysis. The three SNPs generated five haplotypes that represent 98.9% of all haplotypes at this locus. The TAG and GGC haplotypes were more strongly associated with HbF, explaining 11% and 10% of the phenotypic variation in HbF levels, respectively (Table 2). Thus, these haplotypes explain more phenotypic variance than the cumulative sum of rs1427407 in BCL11A taken individually (5.8%; Table 1). In this context, patients carrying the TAG haplotype had higher HbF levels (25.9%) compared to subjects carrying a GGC (15.3%) haplotype. Interestingly, rs7569946, located in exon 4, was uniquely significant (χ2 = 7.5, p = 0.006) for the HbF-3 subgroup only. The considerable effect of the A allele at this locus (β = 0.59) results in an elevation of mean HbF levels for genotypes carrying the minor allele (Table 1).
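The β coefficients and "phenotypic variation explained" figures quoted above are the kind of quantities produced by an additive linear model of HbF on allele dosage. The sketch below shows that calculation for a single SNP under stated assumptions: genotypes are coded 0, 1 or 2 copies of the tested allele, HbF is analysed untransformed, and no covariates are included. The paper does not specify these modelling details, and the function name `additive_fit` is hypothetical.

```python
def additive_fit(dosages: list[int], hbf: list[float]) -> tuple[float, float]:
    """Simple linear regression of HbF (%) on allele dosage (0/1/2).

    Returns (beta, r_squared): beta is the change in HbF per copy of the
    tested allele, r_squared the fraction of HbF variance explained.
    Assumption: no covariates and no trait transformation.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(hbf) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in hbf)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, hbf))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and HbF must both vary")
    beta = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return beta, r_squared


if __name__ == "__main__":
    # Made-up data, for illustration only.
    beta, r2 = additive_fit([0, 1, 1, 2, 2, 0], [18.0, 22.5, 24.0, 29.0, 27.5, 20.0])
    print(f"beta = {beta:.2f}, variance explained = {100 * r2:.1f}%")
```

A negative beta, as reported for the minor alleles of rs7606173 and rs10195871, means each additional copy of the allele is associated with lower HbF.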

HBS1L-MYB Intergenic Region
At this locus, 40 SNPs were genotyped and 20 of them showed exceptional significance for the HbF-3 subgroup (mean = 34.3 ± 5%) (Table S2). In this region, a large number of SNPs had similar allele frequencies and were in strong LD, suggesting that these markers flag the same causal polymorphism. HBS1L-MYB polymorphisms (HMIP) are distributed in three LD blocks [12]. The most influential of these blocks, HMIP-2, is divided into the sub-loci HMIP-2A and -2B. HMIP-2 was shown to influence disease severity in patients with SCD and beta thalassemia [13,14].
Our results highlighted 3 SNPs located in HMIP-2A: rs9399137, rs66650371 (3-bp deletion) and rs35786788, which were in almost complete LD and conveyed the strongest impact in the HbF-3 subgroup (χ2 ≈ 9.5 and p = 0.002), while in HbF-1 the impact of these 3 variants was diminished (p > 0.05) (Table 1). Figure 2 shows the relationship between HbF levels and rs66650371; the homozygous carrier of the 3-bp deletion had the highest HbF levels.
A similar subgrouping trend was found for rs9494145, located in the HMIP-2B sub-locus; it was observed to be in strong LD with rs9483788 and in moderate LD with rs6920211. Subjects carrying the CC genotype of rs9494145 had the strongest association with the HbF-3 subgroup (χ2 = 8.5, p = 0.004). All three aforementioned SNPs maintained significant p values in the HbF-2 and HbF-3 subgroups. In parallel, 3 other SNPs, rs4895441, rs9389269 and rs9402686, which were in strong LD, showed a relatively less significant impact in the HbF-2 and HbF-3 subgroups (Table S2).
Notably, our findings revealed a different behavior of rs35959442 (previously named rs52090909), with a sole significance in the HbF-3 subgroup, similar to rs4895440 with which it was in strong LD. We also identified 4 other variants showing a similar trend in the HbF-3 subgroup (rs9402685, rs6930223, rs9376092 and rs9494142) but with less effect (Table S2).
Strikingly, we detected two previously-unpublished SNPs, rs13220662 and rs1406811, which were uniquely associated with HbF levels in the HbF-3 subgroup (Table 3). Indeed, the minor allele A of rs13220662 and rs1406811 had antagonistic effects on HbF levels. Similar to the previously-mentioned SNPs, rs34778774 also showed significance in the HbF-3 subgroup (Table S2).

Chromosome X Associations
Four SNPs on chromosome X were studied; a moderately strong association with HbF levels was found for rs4969549 in Xp22.11, and for rs12559632 in Xp22.2 (P… Rs12559632 showed significant results in the HbF-2 and HbF-3 groups (χ2 = 4.9 and 7.9, p = 0.026 and p = 0.005, respectively) while rs4969549 was significant only in Hb… = 9.6 and p = 0.008) (Table S4).

We evaluated the effect of multiple variants in the HBS1L-MYB region to detect independent signals of association. Applying stepwise regression on 20 SNPs, we identified two SNPs, rs66650371 and rs35959442, which were independently associated with HbF levels, with weak LD (0.46) between the two. These two SNPs explained 3.4% and 2.7% of the phenotypic variation in HbF, respectively (Table 1).
In order to investigate the effect of the other SNPs, we excluded rs66650371 and rs35959442 and applied additional stepwise regression on the remaining 18 SNPs. We found that rs34778774 and rs4895440 remained independently significant, explaining 3% and 1.6% of the phenotypic variation in HbF, respectively.
Haplotype analysis with the four aforementioned SNPs (rs66650371, rs35959442, rs34778774 and rs4895440) generated four haplotypes that represent 98% of all haplotypes at this locus. Subjects representing 22GT (2: TA, 2: CC, respectively) were associated with significantly higher HbF levels (mean = 26.4%). On the other hand, those with the 11CA (1: TACTA, 1: CCC, respectively) haplotype were associated with relatively lower HbF levels (18.7%). The variance explained by these two haplotypes was 3.4% and 3.8%, respectively (Table 2). Haplotypes identified in the HBS1L-MYB region did not show the trend of cumulative sum found in the haplotypes described in BCL11A.

HBB Locus
Genotyping was performed for 58 SNPs across the β-globin region that includes the locus control region (LCR) and the HBE1, HBG2, HBG1, HBBP1, HBD and HBB genes. Analysis of the markers showed a pattern of high LD across the entire HBB region; rs7482144 was in LD (≤0.7) with rs2855122, rs2855121, rs4910740, rs72872549 and rs67385638 but did not show the strongest impact on HbF levels. The latter SNP, rs67385638, which is an intronic variant of the HBE1 gene, showed the strongest association in the HbF-1 and HbF-2 subgroups (χ2 = 19.2 and χ2 = 19.8, p = 1.7 × 10⁻⁵ and p = 10 × 10⁻⁶, respectively) (Table 1). Other SNPs showing a stronger association than rs7482144 were rs11036474 (HBG2) and rs10128556, located downstream of HBG1 (Table 1). In that context, subjects carrying the CC genotype of rs10128556 had the lowest HbF levels (Figure 1). SNPs rs11036474, rs2855039 and rs10128556, rs2071348 tagged the same signal (Table 1 and Table S3).
Conditional analysis on rs11036474 and rs10128556 caused rs7482144 to lose its significant association with HbF. However, rs7482144 maintained its significance when it was conditioned on rs67385638. These findings clearly indicate that rs7482144 is not the only variant that causes the robust effect on HbF levels (mean = 16.7 ± 2.1%) in Kuwaiti patients with SCD.
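The logic of a conditional analysis can be sketched as follows: the effect of one SNP on HbF is re-estimated after adjusting for a second SNP, and a signal that vanishes is likely explained by the conditioning variant. The sketch uses the residual-on-residual (Frisch-Waugh) form of a two-predictor linear regression. It is a simplified illustration with made-up data, not the procedure run in PLINK, and the function names are hypothetical.

```python
def _slope_and_residuals(x, y):
    """Regress y on x with an intercept; return (slope, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    residuals = [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]
    return slope, residuals


def marginal_beta(snp, hbf):
    """Effect of `snp` dosage on HbF without any adjustment."""
    return _slope_and_residuals(snp, hbf)[0]


def conditional_beta(snp, conditioning_snp, hbf):
    """Effect of `snp` on HbF after adjusting for `conditioning_snp`.

    Both `snp` and `hbf` are residualised on the conditioning SNP, then the
    residuals are regressed on each other (Frisch-Waugh theorem).
    """
    _, snp_resid = _slope_and_residuals(conditioning_snp, snp)
    _, hbf_resid = _slope_and_residuals(conditioning_snp, hbf)
    sxx = sum(r * r for r in snp_resid)
    if sxx < 1e-12:
        raise ValueError("SNP is fully explained by the conditioning SNP")
    return sum(a * b for a, b in zip(snp_resid, hbf_resid)) / sxx


if __name__ == "__main__":
    # Made-up dosages: HbF depends only on snp_b, but snp_a is correlated
    # with snp_b and therefore shows a marginal association.
    snp_a = [0, 1, 1, 1, 2, 1]
    snp_b = [0, 0, 1, 1, 2, 2]
    hbf = [10.0, 10.0, 13.0, 13.0, 16.0, 16.0]
    print(marginal_beta(snp_a, hbf), conditional_beta(snp_a, snp_b, hbf))
```

In this toy data set the marginal effect of `snp_a` is non-zero, while its conditional effect collapses once `snp_b` is accounted for, mirroring the loss of significance described for rs7482144.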
Almost all the SNPs in the HBB locus in this study showed a significant association with HbF levels in the lower (HbF-1) and middle (HbF-2) subgroups. Therefore, identified minor alleles of studied variants in this locus had a negative impact on HbF levels; this was confirmed by the allelic change of beta coefficients. Notably, rs3759071 showed significance solely for the highest subgroup (HbF-3), as confirmed by the allelic change (G) of the beta coefficient (Table 1).
We performed haplotype analysis with the five variants rs10128556, rs11036474, rs7482144, rs72872549 and rs67385638, including a newly-identified SNP, and generated three haplotypes that represent 97.2% of all haplotypes in this locus. 72% of the SCD subjects carrying the TCATG haplotype had higher levels of HbF (mean = 24%); on the other hand, the CTGCC haplotype was associated with lower HbF levels (mean = 16.2%). Haplotypes identified in this locus explained the phenotypic variation in HbF by 3.1% and 6%, respectively (Table 2).
Figure 2 depicts the main findings of our study, showing the selected variants in QTLs (BCL11A, HBS1L-MYB and HBB) which are strongly associated with HbF levels. Each SNP is displayed with its regulatory effect (green arch: associated with HbF-2 and HbF-3; blue arch: associated with HbF-1) with mean HbF % values.

Discussion
The Gaussian distribution of HbF levels among patients with the AI haplotype gives a unique opportunity to investigate the genomic drivers of HbF expression in SCD. Even though Kuwaiti patients generally have elevated HbF levels, there is still a marked variability, leading us to hypothesize that a variety of genetic modifiers act in a stepwise and probably synergistic manner, to drive HbF expression in this group. In order to investigate the variants associated with different degrees of HbF elevation among our patients, we divided the patients into subgroups according to their HbF levels.
Previous genomic studies have identified several SNPs from QTLs on chromosomes 2p15, 6q23 and 11p16 in association with HbF levels and/or F-cell numbers [1,3,13,15]. Many of the reported SNPs likely tag the same genetic signal at each locus since they show moderate to high LD. Indeed, our study confirms that polymorphisms in the BCL11A, HBS1L-MYB, as well as the HBB locus, are associated with HbF levels.
BCL11A is a major regulator of hemoglobin gene switching [16] and a direct repressor of HbF production [17]. Polymorphisms within the 14 kb intron 2 of BCL11A are associated with HbF levels in different populations [3,13,18]. In the present study, BCL11A is the most influential HbF modifier locus, affecting each of the HbF subgroups. We found that rs1427407 in DHS +62 had the strongest association with HbF, especially in the HbF-2 subgroup. The G→T change alters the DNA sequence for a key regulatory element '+62' within the erythroid intronic enhancer for BCL11A [19]. This variant could be causal to HbF association seen with other closely-linked markers in the locus, such as rs766432, rs4671393 and rs1896296. Another SNP, namely rs10195871, was reported to be associated with HbF in several populations [18,20,21], and our study is the first to report this association among Gulf Arabs. Our results clearly indicate that rs10195871 and rs10172646, which were in strong LD, are significant for every HbF subgroup, with the highest significance in HbF-1. Consistent with previous reports [9,18,22], we also found that rs7606173 in DHS +55, is the genetic marker that was strongly associated with HbF, especially in the HbF-1 subgroup.
Our results not only confirm the previously-associated BCL11A SNPs, but also reveal an independent effect of rs1427407 and rs10195871 on HbF regulation. Haplotype analysis showed that this group of SNPs has a stronger impact than individual ones, consistent with the hypothesis [19] that multiple functional SNPs within the composite enhancer act synergistically to affect BCL11A regulation. HbF phenotypic variation of the synonymous variant, rs7569946, within exon 4 in BCL11A, was limited to 1.8%, indicating its unique specificity and significance to the highest HbF levels (HbF-3). Further investigation of this variant is needed to confirm its effect in SCD patients with non-AI βS haplotypes and low HbF levels.
Polymorphisms in the HBS1L-MYB locus are strongly associated with HbF levels among European and Chinese patients with thalassemia and SCD, but are not so significant among African [13,23] and Saudi patients [8]. However, our results showed that SNPs rs66650371, rs9399137 and rs35786788, in this QTL are significantly associated with the "super-HbF expressors" in subgroup HbF-3. The 3-bp deletion rs66650371, located near the erythroid-specific DNase I hypersensitive site 2 within block 2, has previously been shown to be strongly associated with HbF [23]. This site is surrounded by binding sites for erythroid-specific transcription factors such as TAL1/E47, GATA, RUNX1, LDB1 and KLF1, and was proposed to be a major factor contributing to elevate HbF to high levels [23][24][25]. In this regard, the 3-bp deletion is likely of direct functional significance for critical regulatory elements within the core enhancer for MYB, which encodes an important erythroid transcription factor [23,24].
Patients carrying rs66650371 had uniquely high HbF levels, thus making it probably the most functional and independent variant within HBS1L-MYB to elevate HbF to the highest levels. Similar results were reported for rs9399137, which was in complete LD with rs66650371 in African American and Tanzanian patients with sickle cell anemia [26,27]. The phenotypic variance explained by the defined haplotypes did not show higher magnitude than individual SNP effects, indicating that those variants within HMIP act independently to elevate HbF to the extraordinary levels in the Kuwaiti population.
The HBB locus has been extensively studied, and several genetic modifiers of HbF have been detected within the cluster. The XmnI polymorphism, rs7482144, in the proximal promoter of HBG2, tags the AI and SEN haplotypes; its association with HbF levels is well established in different populations [13,16]. The results of our study confirm this association; in addition, we found other variants within this cluster that have an equal or even stronger effect on HbF levels. For example, rs67385638, rs11036474 and rs10128556 were more strongly associated with HbF levels than rs7482144. Conditional analysis showed that rs11036474 and rs10128556 were the most independently-associated variants found in the HbF-1 and HbF-2 subgroups. While rs11036474 has not been reported in other populations except in China [28], our findings on rs10128556 corroborate those of Galarneau et al. [6].
The haplotypes generated by rs10128556, rs11036474, rs7482144, rs72872549 and rs67385638 in the HBB cluster confirmed the negative impact of minor alleles. Indeed, subjects carrying the CTGCC haplotype had 8% lower HbF levels than the ones carrying TCATG. The identified haplotypes explained the phenotypic variation in HbF by 3.1% and 6%, respectively. The observed variances may not reflect the real effect of all synergistic SNPs, which are in strong LD and acting together in a poly-variant manner in the HBB locus. Individual minor alleles of rs10128556-C, rs11036474-T, rs7482144-G, rs72872549-C and rs67385638-C had a relatively lower negative effect than the CTGCC haplotype. Additionally, our findings indicated that the group effect of the major alleles of the five variants used for haplotype analysis might, tentatively, have a stronger effect on HbF levels. The effects of the variants reported in this study indicated that SNPs in the HBB locus were specific to the HbF-1 subgroup. This phenomenon was confirmed by the allelic change of beta coefficients.
In chromosome X, we found two SNPs (rs4969549 and rs12559632) that were also reported by other studies [15,18]. However, the significance of these variants did not surpass a p value of 0.005 which suggests that the X-linked factor influencing HbF production is not crucial in patients with the AI haplotype [29].
The independent as well as the synergistic effects of variants in the BCL11A locus show that this region is associated with HbF across the 3 subgroups, while some variants, especially in the HBS1L-MYB locus, are independent key players, elevating HbF to the highest observed levels. Thus, these "super HbF expressors" probably represent a unique group; the identification of the genetic variants associated with them may provide new therapeutic options for SCD. Since most patients with SCD in Kuwait carry the AI haplotype, the assumption was that their high HbF levels were primarily attributable to rs7482144. However, the results of the present study and others have shown that, even within the HBB locus, there are other SNPs that are more potent. Validation of our hypothesis requires the study of a large number of patients, which unfortunately is not available in Kuwait. For this reason, collaborative studies involving other countries in the region are underway. The present results are therefore preliminary pilot data.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/jpm11060567/s1, Figure S1: Distribution of HbF levels among patients, Table S1: Fetal hemoglobin association results for 24 SNPs at the BCL11A locus in Kuwaiti patients with SCD.
Informed Consent Statement: Patients gave written informed consent or assent as appropriate.
Data Availability Statement: More of the relevant data from the study are provided as supplementary material to this paper, while all data are available on demand from the first author.