Genetic Polymorphism and Natural Selection of Apical Membrane Antigen-1 in Plasmodium falciparum Isolates from Vietnam

Apical membrane antigen-1 of Plasmodium falciparum (PfAMA-1) is a leading malaria vaccine candidate antigen. However, the genetic diversity of pfama-1 and associated antigenic variation in global P. falciparum field isolates are major hurdles to the design of an efficacious vaccine formulated with this antigen. Here, we analyzed the genetic structure and the natural selection of pfama-1 in the P. falciparum population of Vietnam. A total of 37 distinct haplotypes were found in 131 P. falciparum Vietnamese isolates. Most amino acid changes detected in Vietnamese pfama-1 were localized in the ectodomain, domains I, II, and III. Overall patterns of major amino acid changes in Vietnamese pfama-1 were similar to those of global pfama-1, but the frequencies of the amino acid changes slightly differed by country. Novel amino acid changes were also identified in Vietnamese pfama-1. Vietnamese pfama-1 revealed relatively lower genetic diversity than currently analyzed pfama-1 in other geographical regions, and suggested a distinct genetic differentiation pattern. Evidence for natural selection was detected in Vietnamese pfama-1, but it showed purifying selection unlike the global pfama-1 analyzed so far. Recombination events were also found in Vietnamese pfama-1. Major amino acid changes that were commonly identified in global pfama-1 were mainly localized to predicted B-cell epitopes, RBC-binding sites, and IUR regions. These results provide important information for understanding the genetic nature of the Vietnamese pfama-1 population, and have significant implications for the design of a vaccine based on PfAMA-1.


Introduction
Despite a remarkable reduction in global malaria mortality and morbidity in recent years, the disease is still a global public health concern. Approximately 229 million clinical cases of malaria, with an estimated 409,000 deaths, were reported in 2019 [1]. Development of an efficacious vaccine is imperative, considering the huge socio-economic impact of global malaria. However, no effective malaria vaccine is commercially available despite extensive global efforts. The emergence and spread of parasites with antimalarial drug resistance are also a great hindrance to the effective control and elimination of malaria [2].
Apical membrane antigen-1 (AMA-1) of the Plasmodium species is a membrane protein consisting of a signal sequence, a cysteine-rich ectodomain, a conserved cytoplasmic region, and a transmembrane region [3]. The ectodomain is further segmented into three distinct domains, domains I, II, and III [3]. This protein is mainly expressed in the electron-dense neck of rhoptries of sporozoites and merozoites, and plays an important role in the invasion of hepatocytes and erythrocytes by contributing to the attachment of the target cells at the posterior end of Plasmodium parasites [4][5][6][7][8]. AMA-1 is a highly immunogenic protein, and evokes a natural immune response in patients infected with either P. falciparum or P. vivax [9][10][11][12]. Immunization with recombinant AMA-1 elicits antibodies that hinder erythrocyte invasion by the malaria parasite, and confers a protective immune response [11,12]. Therefore, AMA-1 has been recognized as a leading malaria vaccine candidate [13,14]. However, the antibodies induced by AMA-1 recognize either conserved or allele-specific epitopes of AMA-1, resulting in limited protection against distinct alleles [15][16][17]. Although AMA-1 is less genetically diverse in the global population than other vaccine candidate antigens, such as circumsporozoite protein (CSP), Duffy-binding protein (DBP), and merozoite surface protein-1 (MSP-1), it is clearly polymorphic [18][19][20][21][22][23][24][25][26][27]. Genetic polymorphism of AMA-1 in global Plasmodium field isolates, and the resulting variants in different geographic areas, are major hurdles in the development of a global malaria vaccine based on this antigen. Therefore, it is important to monitor genetic variations in this vaccine candidate antigen in global Plasmodium isolates, since accumulated or newly emerging mutations can change the structure of the antigen, making it difficult to design optimized malaria vaccines.
Malaria was one of the most prominent infectious diseases causing high mortality in Vietnam until the early 1990s, but the malaria burden in the country has been dramatically reduced over the past few decades [1]. Between 2012 and 2018, malaria cases and deaths in Vietnam were reduced by 74% and 95%, respectively. The National Malaria Control and Elimination Program (NMCEP) of Vietnam aims to eliminate malaria in the country by 2030 [28]. However, the Central Highlands of Vietnam, which consist of forests and forest edges, remain a high malaria risk area [29], and P. falciparum is the dominant species circulating in the Central Highlands [30,31]. Concerns about the emergence and spread of antimalarial drug-resistant parasites have also increased [32][33][34][35]. In the present study, we analyzed the genetic structure and natural selection of P. falciparum AMA-1 (pfama-1) in P. falciparum isolates from Vietnam to expand our knowledge of genetic variations in the global pfama-1 applicable to the development of a vaccine targeting PfAMA-1.

Blood Samples and Ethics
Blood samples were collected from malaria patients who were infected with Plasmodium falciparum in Dak Lak Province, Central Highlands, Vietnam, in 2019 [31] (Figure 1). Malaria infection was initially diagnosed by microscopic examination of thick and thin blood smears. Finger prick blood samples from the patients were spotted on filter papers (Whatman 3MM, GE Healthcare, Pittsburgh, PA, USA), air-dried, and kept in individual sealed plastic bags at ambient temperature until use. Prior to blood collection, written informed consent was obtained from all the patients. P. falciparum infections were further validated by polymerase chain reaction (PCR) targeting the 18S ribosomal RNA (rRNA) gene [36,37]. The study protocols were reviewed and approved by the Ethics Committee of the Ministry of Health, Institute of Malariology, Parasitology, and Entomology (IMPE), Quy Nhon, Vietnam (No. 368/VSR-LSDT).

Amplification and Sequence Analysis of pfama-1
Parasite genomic DNA was isolated from the blood spots using the QIAamp DNA Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Full-length pfama-1 was amplified from the genomic DNA by nested PCR using primer sets and thermal cycles, as described previously [25]. To minimize nucleotide misincorporation into sequences during amplification steps, Ex Taq DNA polymerase (Takara, Otsu, Japan) with proofreading activity was used in all PCR procedures. Each PCR product was analyzed by electrophoresis on a 1% agarose gel, purified from the gel, and cloned into a T&A cloning vector (Real Biotech Corporation, Banqiao City, Taiwan). Each ligation mixture was transformed into Escherichia coli DH5α competent cells, and positive clones were selected by colony PCR with the nested PCR primers [25]. Nucleotide sequences of cloned genes were analyzed by DNA sequencing using M13 forward and M13 reverse primers. Sequencing was also performed with two internal primers (5′-CAGGGAAATGTCCAGTATTTGGTA-3′ and 5′-TTCCATCGACCCATAATCCG-3′) to obtain reliable sequences corresponding to the central region of full-length pfama-1 [25]. To ensure sequencing accuracy, at least three clones from each isolate were sequenced. Some isolates underwent four- or five-fold sequence coverage to confirm rare polymorphisms. The nucleotide sequences of Vietnamese pfama-1 were deposited at GenBank (accession numbers MW938322-MW938452).

Nucleotide Sequence Polymorphism and Neutrality Test
Nucleotide and deduced amino acid sequences of pfama-1 were analyzed using the EditSeq and SeqMan programs in the DNASTAR package (DNASTAR, Madison, WI, USA). Nucleotide sequence polymorphism was then analyzed in the Vietnamese pfama-1 sequences. The number of segregating sites (S), average number of pair-wise nucleotide differences within a population (K), number of haplotypes (H), haplotype diversity (Hd), and nucleotide diversity (π) were analyzed using DnaSP ver. 5.10.00 [38]. The π value was calculated to estimate step-wise diversity throughout the full-length pfama-1 based on a sliding window of 100 bp with a step size of 25 bp. Non-synonymous (dN) and synonymous (dS) substitutions were estimated and compared with a Z-test (p < 0.05 was considered significant) in the MEGA4 program [39] using the Nei and Gojobori method [40] with Jukes and Cantor correction. Tajima's D value [41], and Fu and Li's D and F values [42] were analyzed using DnaSP ver. 5.10.00 to test the neutral theory of evolution [38]. The recombination parameter (R), which includes the effective population size and the probability of recombination between adjacent nucleotides per generation, and the minimum number of recombination events (Rm) were analyzed with DnaSP ver. 5.10.00 [38]. Linkage disequilibrium (LD) between different polymorphic sites was analyzed based on the R² index using DnaSP ver. 5.10.00 [38].
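The summary statistics named above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: it assumes an ungapped alignment of equal-length sequences, the function names and the toy alignment are hypothetical, and Tajima's D follows the standard published formula without any of the site-exclusion rules DnaSP applies.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """S: number of alignment columns carrying more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """K: average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: K divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D computed from K and S; returns 0.0 if nothing segregates."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return 0.0
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its variance.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment (not study data): 4 sequences, 5 sites.
toy_alignment = ["ATGCA", "ATGCA", "ATGTA", "ACGTA"]
toy_s = segregating_sites(toy_alignment)
toy_k = mean_pairwise_differences(toy_alignment)
toy_pi = nucleotide_diversity(toy_alignment)
toy_d = tajimas_d(toy_alignment)
```

In this toy alignment the two variable sites are carried by more than one sequence on average, so K exceeds S/a1 and D is positive; an excess of rare variants would push D below zero, which is the pattern reported for the Vietnamese sequences.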

Sequence Polymorphism of Vietnamese pfama-1
Nested PCR of pfama-1 from 135 P. falciparum samples from Vietnam resulted in successful amplification of pfama-1 in 131 samples. No amplicon was detected in four P. falciparum isolates. The size of the amplified products was approximately 1.9 kbp, and no size variation was detected among the amplified products. Nucleotide sequence analysis of the 131 Vietnamese pfama-1 sequences against pfama-1 from the 3D7 reference strain (GenBank accession number: U65407) revealed 116 single nucleotide polymorphisms (SNPs), including 52 synonymous and 64 non-synonymous SNPs. The non-synonymous SNPs induced amino acid substitutions at 52 positions in Vietnamese pfama-1 sequences, resulting in 37 distinct haplotypes of pfama-1 at the amino acid level (Figure 2). These amino acid changes were scattered throughout each haplotype of Vietnamese pfama-1, but most of the amino acid changes were detected in domain I (22 positions), domain II (12 positions), and domain III (6 positions). Most amino acid changes were di-morphic (48 positions), but tri-morphic amino acid changes at three positions (H200L/D, E267P/Q, and C320G/W), and a penta-morphic amino acid change at one position (E197D/G/Q/V) were also detected. One amino acid change (D584H) was detected in all Vietnamese pfama-1 sequences (Figure 2). Of the amino acid changes found in Vietnamese pfama-1, changes at 41 positions were previously identified in pfama-1 of P. falciparum isolates from other geographical areas. However, the other 12 changes at 11 positions (N26D, V37M, Y51C, L55S, C320G/W, N338S, Q352P, Y360F, M374T, K459N, and I504S) were novel and have not been reported previously in global pfama-1, although their frequencies were low, ranging from 1.5 to 3.8%. Haplotype 23 was the most prevalent haplotype, accounting for 56.5% (74/131), followed by haplotype 34 (6.9%; 9/131), haplotype 12 (4.6%; 6/131), and haplotype 20 (3.1%; 4/131).
Other haplotypes were detected in only one or two sequences each, resulting in low frequencies.

Figure 2. Most amino acid changes were identified in domains I, II, and III. Amino acid residues identical to the 3D7 sequence are indicated by dots. Based on these amino acid polymorphisms, Vietnamese pfama-1 was classified into 37 distinct haplotypes. These amino acid changes were unevenly scattered in each Vietnamese pfama-1 haplotype. However, a few amino acid changes were commonly identified in Vietnamese pfama-1 with high frequencies. The conserved amino acid change identified in all Vietnamese pfama-1 is marked in red. Tri-morphic amino acid changes (H200D/L, E267P/Q, and C320G/W) are presented in green. The penta-morphic amino acid change (E197D/G/Q/V) is marked in blue. Domains I, II, and III are highlighted by shading in red, yellow, and blue, respectively.
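As a worked check of the haplotype frequencies quoted above, and as an illustration of how haplotype diversity (Hd) is usually defined, the following Python sketch may help. The counts for haplotypes 23, 34, 12, and 20 are taken from the text; the toy haplotype list and the function names are hypothetical, and Hd here uses the common sample-size-corrected formula Hd = n/(n-1) × (1 - Σp²), which is assumed rather than confirmed to match the DnaSP output.

```python
from collections import Counter


def percent_frequencies(counts, total):
    """Convert haplotype counts to percentages rounded to one decimal."""
    return {name: round(100 * count / total, 1) for name, count in counts.items()}


def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_squared)


# Counts reported in the text for the four most prevalent haplotypes (n = 131).
reported_counts = {
    "haplotype 23": 74,
    "haplotype 34": 9,
    "haplotype 12": 6,
    "haplotype 20": 4,
}
reported_percent = percent_frequencies(reported_counts, total=131)

# Hypothetical toy sample (not study data): one label per sequence.
toy_haplotypes = ["H1", "H1", "H1", "H2", "H3"]
toy_hd = haplotype_diversity(toy_haplotypes)
```

A sample dominated by a single haplotype, as in the Vietnamese data, gives a lower Hd than a sample in which haplotypes are evenly represented.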

Amino Acid Polymorphisms in Vietnamese pfama-1 Compared with Global pfama-1
Vietnamese pfama-1 shared similar, but not identical, patterns of amino acid polymorphism with global pfama-1. [...] V/H/G, N162K, T167K, N173K, M190I, H393R, I435N/T, K483I [...]

Figure (heatmap of amino acid changes in global pfama-1). Overall patterns of amino acid changes detected in global pfama-1 were similar, but frequencies of each amino acid change differed by country. Some amino acid changes were characteristic of a country or continent. PNG, Papua New Guinea; SI, Solomon Islands. The heatmap was generated using Morpheus (https://software.broadinstitute.org/morpheus/; accessed on 16 November 2021). The global patterns except Vietnamese pfama-1 have been analyzed previously [25], and the previous data were applied to analyze overall patterns.

Nucleotide Diversity and Natural Selection of Global pfama-1
The Vietnamese pfama-1 showed a substantially lower K value than the global pfama-1, including pfama-1 from the Southeast Asian countries Myanmar and Thailand (Table 2). The K values of African pfama-1 were higher than those of Asian and Pacific pfama-1. The Hd values of African pfama-1 were also higher than those of pfama-1 derived from other geographic areas. In contrast to other global pfama-1, the Vietnamese pfama-1 showed substantially weaker selective pressure, with a dN/dS ratio of 1.250, suggesting minimal positive selection. The π value of pfama-1 differed by country, with African pfama-1 showing higher π values than Asian and Pacific pfama-1 (Table 2). Vietnamese pfama-1 showed the lowest π value among the global pfama-1 analyzed. To determine the occurrence and significance of any deviation from neutral evolution, Tajima's D, and Fu and Li's D and F values were calculated. All pfama-1 sequences except Vietnamese pfama-1 showed positive Tajima's D values, indicating balancing selection in global pfama-1 (Table 2). Both Fu and Li's D and F values also suggested evidence for balancing selection acting primarily on the global pfama-1. However, interestingly, the Vietnamese pfama-1 revealed a negative Tajima's D value, implying a selective sweep or purifying selection. A sliding window plot of π revealed that global pfama-1 shared highly similar patterns of π across the sequences (Figure 5a). The highest peak of π was commonly identified at domain I of all global isolates. The cluster 1 of the loop I (C1-L) region was located on the π peak of domain I. A sliding window plot of Tajima's D also revealed that global pfama-1 exhibited a similar pattern of Tajima's D across the gene, with positive values at domains I and III, despite a few differences among global pfama-1 (Figure 5b). However, the Vietnamese pfama-1 showed a different pattern of Tajima's D, with negative values across the gene, even though Vietnamese pfama-1 showed comparable patterns of two peaks at domains I and III consistent with global pfama-1.

Table 2 note: S, segregating sites, positions which show differences (polymorphisms) between related genes; singleton variable sites, sites that contain at least two types of nucleotides, with at most one occurring multiple times; parsimony-informative sites, sites that contain at least two types of nucleotides, at least two of which occur with a minimum frequency of two; K, average number of pair-wise nucleotide differences; H, number of haplotypes; Hd, haplotype diversity; π, observed average pair-wise nucleotide diversity; dN, rate of non-synonymous substitutions; dS, rate of synonymous substitutions.

Figure 5. (a) Sliding window plot of π; (b) sliding window plot of Tajima's D. The global patterns except Vietnamese pfama-1 have been analyzed previously [25], and the previous data were applied to analyze overall patterns.
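The sliding-window analysis of π can be sketched in Python as follows. This is an illustrative sketch only: the window of 100 bp and step of 25 bp come from the methods, while the function names, the midpoint convention, and the toy sequences are assumptions, and the handling of gaps or missing data in DnaSP is not reproduced.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)


def sliding_pi(seqs, window=100, step=25):
    """Return (window midpoint, pi) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Hypothetical toy alignment (not study data): 3 sequences of 200 bp,
# with one sequence differing at positions 10-14 only.
reference = "A" * 200
variant = "A" * 10 + "C" * 5 + "A" * 185
toy_windows = sliding_pi([reference, variant, reference])
```

Plotting the second element of each pair against the first gives the kind of profile shown in a sliding window plot; in the toy data only the first window, which covers the variable positions, has π above zero.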

Recombination and Linkage Disequilibrium
The estimated minimum number of recombination events between adjacent polymorphic sites (Rm) for Vietnamese pfama-1 was 20 (Table 3). The predicted locations of plausible recombination sites were domains I and III, suggesting that meiotic recombination between the sites may have contributed to the genetic diversity of Vietnamese pfama-1. Possible recombination events were also predicted in global pfama-1. The highest R values were detected in African pfama-1 (Ghana and Tanzania), whereas Asian and Pacific pfama-1, except PNG, showed relatively lower R values (Table 3). The LD index (R²) for global pfama-1 also declined with increasing distance across the gene, suggesting a role of intragenic recombination in the genetic diversity of global pfama-1 (Figure 6).

Table 3 note: The R and Rm were estimated excluding the sites containing alignment gaps or those segregating for three nucleotides. The R was computed using R = 4Nr, where N is the effective population size, and r is the recombination rate per sequence (per gene). n, number of sequences analyzed; Ra, recombination parameter between adjacent sites; Rb, recombination parameter for entire gene; Rm, minimum number of recombination events between adjacent sites. # Cited from [25].

Figure 6. Recombination event in global pfama-1. Linkage disequilibrium (LD) plot suggested non-random associations between nucleotide variations in pfama-1 at different polymorphic sites. R² values were plotted against nucleotide distance using a two-tailed Fisher's exact test for statistical significance. PNG, Papua New Guinea. The global patterns except Vietnamese pfama-1 have been analyzed previously [25], and the previous data were applied to analyze overall patterns.
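The decline of R² with distance can be illustrated with a short Python sketch that computes the squared allele-frequency correlation between pairs of biallelic sites. It is a simplified illustration under stated assumptions: haplotypes are treated as phased and ungapped, only biallelic columns are used, the function names and toy haplotypes are hypothetical, and the Fisher's exact test used for significance in the figure is not included.

```python
from itertools import combinations


def biallelic_sites(seqs):
    """Indices of alignment columns with exactly two alleles."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) == 2]


def r_squared(seqs, i, j):
    """r^2 = D^2 / (pA(1-pA) pB(1-pB)) for two biallelic sites i and j."""
    n = len(seqs)
    allele_a = seqs[0][i]  # arbitrary reference allele at site i
    allele_b = seqs[0][j]  # arbitrary reference allele at site j
    p_a = sum(s[i] == allele_a for s in seqs) / n
    p_b = sum(s[j] == allele_b for s in seqs) / n
    p_ab = sum(s[i] == allele_a and s[j] == allele_b for s in seqs) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_by_distance(seqs):
    """List of (distance in sites, r^2) for every pair of biallelic sites."""
    sites = biallelic_sites(seqs)
    return [(j - i, r_squared(seqs, i, j)) for i, j in combinations(sites, 2)]


# Hypothetical toy haplotypes (not study data).
toy_haplotypes = ["AAT", "AAT", "CCT", "CCG"]
toy_ld = ld_by_distance(toy_haplotypes)
```

In the toy data the first two sites are perfectly associated (r² of 1), while each is only partly associated with the third site; over a real gene, a downward trend of r² against distance is what is read as a signature of intragenic recombination.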

Haplotype Network Analysis
In order to analyze the relationships among the global pfama-1, a haplotype network was constructed. A dense network of 245 individual haplotypes with complicated relationships was established using 648 global pfama-1 sequences (Figure 7). Haplotype 57 (H57) was a predominant haplotype shared by isolates from six Asian and Pacific countries: Myanmar, Thailand, Philippines, PNG, Solomon Islands, and Vanuatu. Haplotype 72 (H72) was also a major haplotype, with a prevalence of 10.5%, and was shared by Pacific populations. Four haplotypes (H71, H73, H79, and H112) were admixed with Pacific populations. H6 and H16 comprised solely African populations. Five haplotypes (H46, H51, H76, H97, and H111) were shared by African populations and Asian or Pacific populations. Interestingly, most Vietnamese pfama-1 haplotypes did not cluster with Asian and Pacific populations, and instead clustered into two separate haplotype groups, which branched from H158 and H188.

Nucleotide Differentiation among Global pfama-1
To further analyze the genetic differentiation and gene flow in global pfama-1, Fst values were analyzed (Table 4). The Fst values between pfama-1 populations from different geographical areas ranged from 0.00019 (between Ghana and Tanzania) to 0.42009 (between Vietnam and Solomon Islands). Interestingly, the Vietnamese pfama-1 exhibited large genetic differentiation from other global pfama-1.

Table 4 note: Fst values are represented in the lower left quadrant, and average number of pair-wise nucleotide differences between [...]
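For readers unfamiliar with Fst, the following Python sketch shows one common sequence-based estimator, Fst = 1 - Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb the mean between populations. The text does not state which estimator was used, so this choice is an assumption made for illustration; the function names and toy populations are hypothetical.

```python
from itertools import combinations, product


def differences(s1, s2):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(s1, s2))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb; returns 0.0 when the populations do not differ at all."""
    h_within = (mean_within(pop1) + mean_within(pop2)) / 2
    h_between = mean_between(pop1, pop2)
    if h_between == 0:
        return 0.0
    return 1 - h_within / h_between


# Hypothetical toy populations (not study data).
toy_pop_a = ["AAAA", "AAAT"]
toy_pop_b = ["CCAA", "CCAT"]
toy_fst = pairwise_fst(toy_pop_a, toy_pop_b)
```

Values near zero, such as the one reported between Ghana and Tanzania, mean that sequences from the two populations are about as different from each other as sequences within each population; values approaching 0.4, as reported for Vietnam and Solomon Islands, indicate that most differences lie between the populations.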

Association between Natural Selection and Host Immune Pressure
To evaluate selective pressure of host immunity on pfama-1, the genetic polymorphisms in the predicted B-cell epitopes, RBC-binding sites, and IUR regions of pfama-1 were analyzed. Most major amino acid changes were detected in the predicted B-cell epitopes, RBC-binding sites, or IUR regions of pfama-1 (Figure 8a). Of 91 amino acid changes identified in global pfama-1 compared with the 3D7 sequence (GenBank accession No.: U65407), 72 were found at the predicted B-cell epitopes, RBC-binding sites, or IUR regions. Among 51 amino acid changes detected commonly in global pfama-1, 42 were located in the predicted B-cell epitopes, RBC-binding sites, or IUR regions. Twenty-nine less common amino acid changes in global pfama-1 were also detected in the predicted B-cell epitopes, RBC-binding sites, or IUR regions. Eight of eleven predicted B-cell epitopes were polymorphic. Particularly, B-cell epitopes 3, 4, 5, 8, and 10 had major polymorphic amino acid residues with high levels of π (Figure 8b). Tajima's D values for the predicted B-cell epitopes 3, 4, and 8 were positive, whereas the values for the predicted B-cell epitopes 5 and 10 were negative (Figure 8b). Amino acid changes commonly identified in global pfama-1 were mainly detected at the C1-L region, which is localized near the hydrophobic pocket of PfAMA-1 [46], and corresponded to the π peak in domain I. A few less frequently observed amino acid changes in global pfama-1 were detected in the loop II region.
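The mapping of amino acid changes onto annotated regions can be sketched as a simple interval lookup. In the Python sketch below, the C1-L (aa 196-207) and loop II (aa 348-392) boundaries and the listed changes are taken from this article; the coordinates of the B-cell epitopes, RBC-binding sites, and IUR regions are not given in the text and are therefore omitted, and the function names are hypothetical.

```python
import re

# Region boundaries (inclusive amino acid positions) stated in the article.
REGIONS = {
    "C1-L": (196, 207),
    "loop II": (348, 392),
}


def position_of(change):
    """Extract the residue number from a label such as 'E197D/G/Q/V'."""
    match = re.match(r"[A-Z](\d+)", change)
    if match is None:
        raise ValueError(f"unrecognized amino acid change label: {change!r}")
    return int(match.group(1))


def regions_for(change, regions=REGIONS):
    """Names of all annotated regions that contain the changed residue."""
    position = position_of(change)
    return [name for name, (start, end) in regions.items() if start <= position <= end]


# A subset of the amino acid changes reported for Vietnamese pfama-1.
changes = ["E197D/G/Q/V", "H200L/D", "E267P/Q", "C320G/W", "Q352P", "Y360F", "M374T"]
mapped = {change: regions_for(change) for change in changes}
```

Extending `REGIONS` with epitope or binding-site coordinates from the cited prediction studies would allow the same lookup to count how many changes fall inside each class of region.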

Figure 8. (a) Predicted B-cell epitopes, RBC-binding sites [44], and IUR regions [45] are represented by blue boxes, red boxes, and green boxes, respectively. Polymorphic amino acid residues commonly detected in global pfama-1 are marked in bold red. The less commonly detected amino acid changes are shown in bold blue. The C1-L (cluster 1 of loop I, aa: 196-207) in DI and the loop II (aa: 348-392) in DII are marked with yellow and sky-blue squares, respectively. (b) Nucleotide diversity and natural selection analysis. Nucleotide diversity (π) and Tajima's D (TD) values for each B-cell epitope region, RBC-binding region, and IUR region in global pfama-1 were analyzed using the DnaSP program. PNG, Papua New Guinea; SI, Solomon Islands. The global patterns except Vietnamese pfama-1 have been analyzed previously [25], and the previous data were applied to analyze overall patterns.

Discussion
In the present study, we analyzed the genetic polymorphisms and natural selection of Vietnamese pfama-1 in order to understand its genetic nature. Diverse amino acid changes resulting from SNPs were detected in Vietnamese pfama-1, similar to global pfama-1 reported from other countries [25,45,47-50]. The 51 major amino acid changes that are commonly detected in global pfama-1 were also observed in Vietnamese pfama-1, but their patterns and frequencies differed from those in global populations. The majority of the common amino acid changes were localized in domains I, II, and III, supporting the view that these domains are the major regions contributing to pfama-1 polymorphism [25]. Some amino acid changes were unique to pfama-1 of specific countries or continents. In particular, Vietnamese pfama-1 showed different patterns of amino acid changes from global pfama-1. Twelve amino acid changes at 11 positions (N26D, V37M, Y51C, L55S, C320G/W, N338S, Q352P, Y360F, M374T, K459N, and I504S) were unique to Vietnamese pfama-1 and were not detected in global pfama-1. By contrast, 11 amino acid changes (D36N/V/H/G, N162K, T167K, N173K, M190I, H393R, I435N/T, K485I, D493A, E581Q, and N589T/K), which were commonly identified in global pfama-1 with varying frequencies, were not detected in Vietnamese pfama-1. Haplotype network analysis also revealed that Vietnamese pfama-1 formed distinct clusters, which were clearly distinguished from other global populations, including pfama-1 from the neighboring Southeast Asian countries of Myanmar and Thailand. These findings suggest that the Vietnamese pfama-1 exhibited distinct patterns of polymorphism and genetic differentiation compared with other global isolates. The Fst value, a measure of population substructure based on an analysis of the overall genetic differentiation among populations [51], also suggested genetic differentiation of Vietnamese pfama-1 from other global populations. Because few full-length pfama-1 sequences from diverse geographical origins are currently available, the implications of the substantial differentiation of Vietnamese pfama-1 from global pfama-1 are currently unclear. Further analysis of genetic polymorphisms of pfama-1 in a larger number of global P. falciparum populations is necessary to appreciate the polymorphic nature and evolutionary linkage of global pfama-1.
The π value of global pfama-1 differed depending on the origin of the isolates. The nucleotide diversity in Vietnamese pfama-1 (π = 0.0043) was lower than in isolates from other geographical areas. Lower malaria transmission in restricted areas of Vietnam compared with other endemic countries may contribute to the low-level genetic diversity in Vietnamese pfama-1. However, it is also necessary to consider that the global pfama-1 sequences analyzed in this study were obtained from parasites collected at different time points in each country, and therefore do not reflect the π values of the pfama-1 populations in the same period. In fact, the genetic structure of Plasmodium populations in endemic areas or countries changes dynamically over time due to various factors [52][53][54][55][56]. Differences in the number of isolates from each country may also affect the π value of pfama-1 in each country. Therefore, a systematic analysis of global P. falciparum isolates collected in the same period, preferably recently, may be necessary to delineate the genetic variations and evolutionary trends of global pfama-1. Although the π value of global pfama-1 differed by country, synchronized patterns of π across pfama-1 were recognized in global pfama-1, including Vietnamese pfama-1. A sliding window plot suggested that the nucleotide diversity was unevenly distributed across pfama-1, and high levels of π were similarly observed in domains I and III of global pfama-1, supporting the view that these domains are the main regions contributing to the genetic heterogeneity of pfama-1.
Natural selection, probably driven by the host immune response, and recombination between genetically distinct alleles during meiotic replication in the mosquito midgut are understood to be the two main mechanisms underlying pfama-1 genetic diversity [25,57,58]. Interestingly, the Vietnamese pfama-1 showed different patterns of natural selection compared with pfama-1 from other countries. In contrast to global pfama-1, which showed positive values of Tajima's D, the Vietnamese pfama-1 revealed a negative Tajima's D value, implying purifying selection. The dN/dS ratios of global pfama-1 were relatively high in all isolates, but the value was much lower in Vietnamese pfama-1, suggesting the absence of strong positive selection. These findings suggest that Vietnamese pfama-1 underwent different natural selection compared with pfama-1 from other countries, probably due to a selective sweep or a bottleneck effect. However, regardless of these differences, the sliding window plot revealed that domains I and III exhibited high values of π and Tajima's D across global pfama-1, implying that these regions represent possible dominant targets for natural selection by the host immune response. Indeed, most common amino acid changes identified in global pfama-1 were mainly scattered in domains I and III, corresponding to B-cell epitopes 3, 4, 5, 9, and 10. The C1-L cluster, which is located near the hydrophobic pocket of domain I in PfAMA-1 [46], is known to affect the binding capacity of inhibitory antibodies, and thereby mediates escape from PfAMA-1 antibodies induced by P. falciparum infection or vaccine trials [16,45,59-61]. The clusters of amino acid polymorphisms, including tri- and hepta-morphic changes, scattered in the C1-L region of global pfama-1 suggest strong natural selection, which further contributes to host immune escape [17,61,62]. Meanwhile, loop II, the target of the 4G2 inhibitory antibody [63], was highly conserved in global pfama-1, suggesting that this region might be a vaccine candidate based on PfAMA-1 [25]. Meiotic recombination is also one of the main forces driving allelic diversity of pfama-1 [25,50,57]. Substantial levels of recombination events in pfama-1 derived from other geographical isolates have been reported [25]. Although the Vietnamese pfama-1 revealed a lower level of genetic diversity than the other global pfama-1, a comparable level of recombination events and a decline of the LD index R² were also detected in Vietnamese pfama-1. These findings suggest that interallelic recombination is another force generating genetic diversity of the Vietnamese pfama-1, consistent with findings from other geographical areas.

Conclusions
Overall patterns of nucleotide diversity and major amino acid changes in Vietnamese pfama-1 were similar to those seen in global pfama-1. However, the Vietnamese pfama-1 revealed relatively lower genetic diversity than pfama-1 populations from other geographical regions, and showed a distinct genetic differentiation profile. Although evidence for natural selection and recombination, which may contribute to the generation and maintenance of genetic diversity of pfama-1, was also found in Vietnamese pfama-1, it revealed a distinct trend of natural selection compared with other global pfama-1. These results have significant implications for understanding the genetic nature of the Vietnamese P. falciparum population, and warrant continuous monitoring of the genetic diversity of global pfama-1 to elucidate the polymorphic nature and evolutionary aspects of pfama-1 and to design an effective vaccine targeting P. falciparum populations globally.