Risk Association, Linkage Disequilibrium, and Haplotype Analyses of β-Like Globin Gene Polymorphisms with Malaria Risk in the Sabah Population of Malaysian Borneo

The associations of single nucleotide polymorphisms (SNPs) in the β-like globin gene of human hosts with the risk of malaria are unclear. Therefore, this study investigated these associations in the Sabah population, which has a high incidence of malaria. In brief, DNA was extracted from 188 post-diagnostic blood samples infected with Plasmodium parasites and from 170 healthy controls without a history of malaria. Genotyping of the β-like globin C-158T, G79A, C16G, and C-551T SNPs was performed using a polymerase chain reaction-restriction fragment length polymorphism approach. Risk association, linkage disequilibrium (LD), and haplotype analyses of these SNPs were then performed. The variant alleles of the C-158T and C16G SNPs were associated with approximately half the risk of malaria infection (about 0.5-fold), while the variant allele of the G79A SNP was associated with a 6-fold increased risk of malaria infection. No SNP combination was in perfect LD, but several haplotypes (CGCC, CGCT, and CGGC) were associated with malaria risk to differing degrees in the population. In conclusion, the C-158T, G79A, and C16G SNPs in the β-like globin gene are associated with the risk of malaria. The haplotypes (CGCC, CGCT, and CGGC) identified in this study could serve as biomarkers to estimate malaria risk in the population. This study provides essential data for the design of malaria control and management strategies.


Introduction
Malaria is a vector-borne disease caused by Plasmodium parasites, which are transmitted through the bites of infected Anopheles mosquitoes. Five Plasmodium species are known to infect humans: Plasmodium falciparum, Plasmodium vivax, Plasmodium ovale, Plasmodium malariae, and the zoonotic Plasmodium knowlesi. In 2020, the number of malaria cases was estimated to have risen to 241 million globally [1]. Although malaria-related deaths had steadily decreased over the last 20 years, they increased by 12% in 2020 compared to 2019, resulting in a mortality rate of 15 deaths per 100,000 population [1]. The recent increase in malaria deaths has been linked to reduced access to health care services, including diagnostic and clinical care, during the COVID-19 pandemic, especially in the African region [2]. Because COVID-19 continues to spread around the globe, malaria deaths may continue to rise over the next few years.
Malaysia is also one of the countries significantly affected by malaria. Approximately 1.3 million people in Malaysia are at risk of infection by Plasmodium parasites, and about 10,000 of them live in areas with active foci [1]. A recent study revealed that Sabah, Malaysian Borneo, accounted for 43.3% of all malaria cases in Malaysia [3]. The potentially fatal P. knowlesi is the most commonly reported species in Sabah. Many studies have been conducted to better understand the effects of this Plasmodium species in Sabah, including the risk factors for P. knowlesi infection and the genetic diversity of P. knowlesi [3][4][5][6].
Variants in human genes are frequently associated with the risk of certain diseases. For example, the C-158T (rs7482144) single nucleotide polymorphism (SNP) in the 5′ region of the hemoglobin subunit gamma 2 gene is linked with higher fetal hemoglobin levels, sickle cell anemia, and thalassemia [7,8]. Another SNP in the 5′ region of the β-globin gene, C-551T (no rs number available), acts as a silencer of the gene's transcription and is associated with a thalassemic phenotype [9]. Furthermore, earlier research has linked the C16G (rs10768683) and G79A (rs33950507) SNPs in the globin gene to modifications in erythrocyte structure and to thalassemia [10][11][12]. All of these SNPs are related to the function or structure of red blood cells. It is therefore of interest to study whether they protect against malaria, since SNPs in the β-like globin gene are the ones most frequently related to malaria in case-control and genome-wide association studies [13][14][15]. These data also indicate that SNPs in the β-like globin gene have great potential for predicting the risk of malaria in human populations.
Sabah, a state with many rural areas and a hotspot for malaria transmission, reported a higher malaria incidence from 2013 to 2017 than Sarawak and Peninsular Malaysia [3]. However, data on the association of SNPs in the β-like globin gene with malaria risk in the Sabah population of Malaysian Borneo remain scarce. Thus, this study aims to determine the association, linkage disequilibrium (LD), and haplotypes of the C-158T, G79A, C16G, and C-551T SNPs in the β-like globin gene with the risk of malaria in the Sabah population of Malaysian Borneo.

Blood Samples Collection and DNA Extraction
A total of 188 post-diagnostic blood samples infected with Plasmodium parasites were collected from the Public Health Laboratory, Kota Kinabalu, Sabah. Among these samples, 134 were infected with P. knowlesi, 28 with P. falciparum, and 26 with P. vivax, as previously validated molecularly using the PlasmoNex™ multiplex assay [16]. In addition, 170 blood samples were randomly collected from healthy volunteers living within the Kota Kinabalu area who had no medical history of malaria. The inclusion criteria for this study were: (i) the subjects were able to donate 3 mL of blood, and (ii) the subjects were able to provide written consent for the study. The exclusion criteria were: (i) cases that were not infected with a Plasmodium parasite, and (ii) controls who were not living within the Kota Kinabalu area or who had a history of malaria infection. Genomic DNA was extracted from the blood samples using a previously described method [17]. The study was conducted following the Declaration of Helsinki, and ethical approval was obtained from the UMMC Medical Research & Ethics Committee (reference no.: 709.1).

Polymerase Chain Reaction-Restriction Fragment Length Polymorphism (PCR-RFLP)
A polymerase chain reaction (PCR) master mix consisting of 100 ng of DNA template, 1× GoTaq® Flexi Buffer (Promega, Madison, WI, USA), 1 unit of GoTaq® Flexi DNA polymerase (Promega, Madison, WI, USA), 1.5 mM MgCl2 solution, 0.2 mM dNTP mixture, and 0.2 µM each of the forward and reverse primers (Table 1) was prepared separately for each SNP. The mixture was topped up with sterile distilled water (sdH2O) to a final volume of 20 µL. The PCR conditions were as follows: initial denaturation at 94 °C for 4 min; 35 cycles of amplification at 94 °C for 30 s, the annealing temperature of the respective primer set (Table 1) for 30 s, and 72 °C for 45 s; and a final extension at 72 °C for 4 min. The PCR products were electrophoresed in a 2% agarose gel stained with ethidium bromide.
A restriction fragment length polymorphism mixture consisting of 4 µL of PCR product, 2.5 units of the restriction enzyme (New England Biolabs, Ipswich, MA, USA) for the respective SNP (Table 1), and 1× CutSmart™ Buffer (New England Biolabs, Ipswich, MA, USA) was prepared. The mixture was topped up with sdH2O to a final volume of 15 µL and incubated overnight at 37 °C. The digested products were analyzed in a 2-3% agarose gel stained with ethidium bromide (Figure S1), and the genotype for each SNP was recorded.
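As a worked illustration of the 20 µL PCR mix described above, reagent volumes follow from C1·V1 = C2·V2. The sketch below takes the final concentrations from the text, but every stock concentration (5× buffer, 25 mM MgCl2, 10 mM dNTP mix, 10 µM primers, 5 U/µL polymerase) and the 50 ng/µL template stock are assumptions made for illustration only; they are not stated in the protocol and should be replaced with the concentrations actually used.

```python
# Reagent volumes for one 20 µL PCR reaction using C1*V1 = C2*V2.
# Final concentrations are from the protocol text.
# All stock concentrations are ASSUMED (hypothetical) values.

FINAL_VOLUME_UL = 20.0

# name: (assumed stock concentration, final concentration), same unit per pair
REAGENTS = {
    "buffer (x)": (5.0, 1.0),
    "MgCl2 (mM)": (25.0, 1.5),
    "dNTP mix (mM)": (10.0, 0.2),
    "forward primer (uM)": (10.0, 0.2),
    "reverse primer (uM)": (10.0, 0.2),
}


def reagent_volumes(final_volume_ul, reagents):
    """Return the volume (µL) of each stock needed for one reaction."""
    return {
        name: final * final_volume_ul / stock
        for name, (stock, final) in reagents.items()
    }


volumes = reagent_volumes(FINAL_VOLUME_UL, REAGENTS)

# 1 unit of polymerase from an assumed 5 U/µL stock.
polymerase_ul = 1.0 / 5.0
# 100 ng of template from an assumed 50 ng/µL DNA stock.
template_ul = 100.0 / 50.0

# Sterile distilled water makes up the remainder of the 20 µL.
water_ul = FINAL_VOLUME_UL - sum(volumes.values()) - polymerase_ul - template_ul

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µL")
print(f"polymerase: {polymerase_ul:.2f} µL")
print(f"template: {template_ul:.2f} µL")
print(f"water: {water_ul:.2f} µL")
```

When several reactions are prepared at once, each volume can be multiplied by the number of reactions plus a small pipetting excess.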

Statistical Analyses
Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using SPSS software ver. 22.0 (IBM, Armonk, NY, USA). The LD and haplotype analyses of the SNPs were performed using the SHEsis software [21]. A p-value of less than 0.05 was considered statistically significant.
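For readers unfamiliar with the calculation, the sketch below shows one common way to obtain an OR and its 95% CI from a 2 × 2 table of allele (or genotype) counts, using the Woolf (log-scale Wald) interval. The text does not state which procedure was run in SPSS, so this method is an assumption, and the counts in the example are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z_crit=1.96):
    """Odds ratio with a Woolf (log-scale Wald) confidence interval.

    a: cases carrying the variant      b: cases without the variant
    c: controls carrying the variant   d: controls without the variant
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se)
    upper = math.exp(math.log(odds_ratio) + z_crit * se)
    # Two-sided p-value for H0: OR = 1, from the standard normal distribution.
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, lower, upper, p_value


# Hypothetical counts for illustration only (not data from this study).
odds_ratio, lower, upper, p_value = odds_ratio_ci(a=20, b=80, c=40, d=60)
print(f"OR = {odds_ratio:.3f}, 95% CI = {lower:.3f}-{upper:.3f}, p = {p_value:.4f}")
```

An OR below 1 whose CI excludes 1 indicates lower odds of infection among carriers of the variant, while an OR above 1 indicates higher odds.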

Association of the β-Like Globin SNPs with the Risk of Malaria
Our findings revealed that the variant T allele of the C-158T SNP was associated with a lower risk of malaria infection in general (OR = 0.48, 95% CI = 0.25-0.92, p = 0.027) and a lower risk of infection by P. knowlesi (OR = 0.45, 95% CI = 0.21-0.95, p = 0.035) (Table 2). In contrast, the variant A allele of the G79A SNP was associated with a roughly 6-fold increased risk of P. falciparum infection (OR = 6.36, 95% CI = 1.25-32.33, p = 0.026) (Table 3). Subjects who inherited the variant G allele of the C16G SNP had approximately half the risk of P. vivax infection (OR = 0.46, 95% CI = 0.24-0.87, p = 0.017) (Table 4). There was no evidence of an association between the C-551T SNP and the risk of malaria (Table 5).
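The reported ORs, CIs, and p-values can be cross-checked against each other. The sketch below assumes the intervals are Wald-type intervals that are symmetric on the log scale (an assumption, since the text does not state the method): under that assumption the geometric mean of the CI bounds should reproduce the OR, and the CI width gives the standard error of ln(OR), from which a two-sided p-value follows. Small discrepancies are expected because the published values are rounded.

```python
import math

# (label, OR, CI lower, CI upper, reported p) as given in the text above.
REPORTED = [
    ("C-158T, any malaria", 0.48, 0.25, 0.92, 0.027),
    ("C-158T, P. knowlesi", 0.45, 0.21, 0.95, 0.035),
    ("G79A, P. falciparum", 6.36, 1.25, 32.33, 0.026),
    ("C16G, P. vivax", 0.46, 0.24, 0.87, 0.017),
]


def wald_check(odds_ratio, lower, upper, z_crit=1.96):
    """Back-calculate the CI midpoint and p-value, assuming a Wald interval."""
    # Geometric mean of the bounds equals the OR for a log-symmetric interval.
    midpoint = math.sqrt(lower * upper)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return midpoint, p_value


for label, odds_ratio, lower, upper, p_reported in REPORTED:
    midpoint, p_calc = wald_check(odds_ratio, lower, upper)
    print(f"{label}: midpoint = {midpoint:.2f} (OR {odds_ratio}), "
          f"p = {p_calc:.3f} (reported {p_reported})")
```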

Discussion
Sabah, Malaysian Borneo, had the highest numbers of malaria cases and deaths among all states in Malaysia from 2013 to 2017 [3]. Hence, determining the association of genetic polymorphisms with the risk of malaria, broken down by Plasmodium species, can provide useful information for public health officials designing malaria interventions, in line with the vision of the World Health Organization and the global malaria community of a world free of malaria. Several SNPs in the β-like globin gene have been reported to have different effects on malaria susceptibility. This study investigated the association of the β-like globin C-158T, G79A, C16G, and C-551T SNPs with the risk of malaria in the Sabah population of Malaysian Borneo.
The C-158T SNP is located upstream of the γ-globin gene and directly affects the production of hemoglobin F (HbF) [22]. This is consistent with the present study, in which individuals who inherited the variant T allele of this SNP had a significantly reduced risk of malaria, especially of P. knowlesi infection. Similarly, a pediatric malaria study reported that the spread of the Plasmodium parasite was retarded by HbF-containing red blood cells [23]. This suggests that the C-158T polymorphism might play a critical role in the variation of HbF levels, which in turn influences protection against malaria. However, this study focused solely on the association between genetic factors and malaria risk, and the HbF levels of the subjects were not available.
The G79A SNP is a mutation at codon 26 of the β-globin gene that causes the hemoglobin E (HbE) variant [24]. This study observed that the variant A allele of this SNP was associated with a 6-fold increased risk of P. falciparum infection. A previous study has empirically shown that the variant allele creates an alternative splicing site, which leads to the production of a mutated beta-chain and is associated with a blood disorder, HbE beta-thalassemia [25]. Although reports suggest that people with blood abnormalities such as thalassemia are protected against Plasmodium infection [26], the results of this investigation for the G79A polymorphism showed otherwise. The prevalence of thalassemia is high in Sabah, Malaysian Borneo [27]. Therefore, the association between blood disorders and protection against Plasmodium infection should be explored further.
The C16G SNP is located in the splicing region of the β-globin gene, and in silico studies revealed that the variant G allele could contain two possible branch sites that might induce a truncated protein [28,29], resulting in a lower hemoglobin concentration. Plasmodium parasites invade red blood cells and digest hemoglobin, so a lower hemoglobin level could plausibly result in a lower risk of malaria infection. This may explain why individuals who carried the variant GG genotype had approximately half the risk of P. vivax infection, as observed in the present study. However, a recent study reported that infants with lower hemoglobin levels were not protected against malaria infection in a Papua New Guinean population [30]. Since the present study only involved adults and the hemoglobin levels of the subjects were not assessed, further studies that investigate the risk of malaria in relation to age and hemoglobin levels would be important for a better understanding of this complex interaction.
This study did not observe a significant association between the C-551T polymorphism and malaria risk. Nevertheless, the frequency of this polymorphism appears to track malaria incidence across populations. For instance, the Indian population has the highest frequency of the C-551T polymorphism (79%), whereas the Greek population has the lowest (37%) [31]. At the same time, India has a high malaria incidence, while malaria was eradicated in Greece in 1974 [1,32]. This might suggest genetic drift, as malaria has been eliminated in a population with a low frequency of the C-551T polymorphism. However, these data should be interpreted carefully, since other factors, such as geographical differences and the prevalence of Plasmodium species and Anopheles mosquitoes, must also be considered.
A recent genome-wide association study on malaria parasites reported that LD and large haplotype blocks from 3.6 kb to 14.0 kb on chromosome 8 of P. vivax isolates were associated with different malaria transmission rates [33]. It is therefore of interest to study whether the LD and haplotypes of SNPs in the β-like globin gene of human hosts are associated with the risk of malaria. Although no SNP combination was in perfect LD, the haplotype-based association analysis in this study identified three significant haplotypes (CGCC, CGCT, and CGGC) that were associated with malaria risk to differing degrees. These haplotypes could be used as biomarkers in the Sabah population to estimate malaria risk.
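To make the notion of "perfect LD" concrete, the sketch below computes the standard pairwise LD measures D, D′, and r² for two biallelic SNPs. It assumes that the two-locus haplotype frequency and the allele frequencies are already known; SHEsis estimates haplotype frequencies from unphased genotypes, a step that is not reproduced here. The frequencies in the example are hypothetical, and perfect LD is taken to mean r² = 1.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1
          and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    # Deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies for illustration only (not data from this study).
partial = pairwise_ld(p_ab=0.30, p_a=0.40, p_b=0.50)
perfect = pairwise_ld(p_ab=0.30, p_a=0.30, p_b=0.30)
print("partial LD: D = %.3f, D' = %.3f, r2 = %.3f" % partial)
print("perfect LD: D = %.3f, D' = %.3f, r2 = %.3f" % perfect)
```

Note that D′ can equal 1 while r² is well below 1 when the two SNPs have different allele frequencies, which is why the two measures are usually reported together.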
One limitation of this study is that it focused only on the genetic aspect, and the hemoglobin levels of the human hosts were not assessed. Therefore, the interplay between the SNPs in the β-like globin gene and hemoglobin levels in relation to the risk of malaria infection could not be estimated. Future studies should address this limitation in their designs.

Conclusions
In conclusion, this is the first study to investigate the association of the β-like globin SNPs with the risk of malaria in the Sabah population of Malaysian Borneo. This study suggests that the C-158T and C16G SNPs are protective against malaria infection, while the G79A SNP is a risk factor for malaria infection. Several haplotypes (CGCC, CGCT, and CGGC) associated with malaria risk to differing degrees were identified, and they could serve as biomarkers for malaria risk estimation in the population. The data from this study could contribute to understanding the hosts' immune responses and susceptibility to malaria and to the development of antimalarial drugs.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13071229/s1, Figure S1: Restriction enzyme-digested fragments of the β-like globin SNPs that were analyzed in agarose gel.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data of this study are included in tables, figures, and referenced articles.