Genetic Diversity of Plasmodium vivax Cysteine-Rich Protective Antigen (PvCyRPA) in Field Isolates from Five Different Areas of the Brazilian Amazon

The Plasmodium vivax Cysteine-Rich Protective Antigen (PvCyRPA) has an important role in erythrocyte invasion and has been considered a target for vivax malaria vaccine development. Nonetheless, its genetic diversity remains uncharted in Brazilian malaria-endemic areas. Therefore, we investigated the pvcyrpa genetic polymorphism in 98 field isolates from the Brazilian Amazon and its impact on the antigenicity of predicted B-cell epitopes. We estimated genetic diversity parameters, performed population genetic analyses and neutrality tests, built a median-joining haplotype network, and investigated whether polymorphic amino acids participate in B-cell epitopes. One synonymous and 26 non-synonymous substitutions defined 50 haplotypes. The nucleotide diversity and Tajima’s D values varied across the coding gene. The exon-1 sequence had greater diversity than that of exon-2. In the prediction analysis, seven sequences were predicted as linear B-cell epitopes, most of them contained in conformational epitopes. Moreover, amino acid polymorphisms were detected in regions predicted to contain residues participating in B-cell epitopes. Our data suggest that the pvcyrpa gene is moderately polymorphic in the studied isolates and that these polymorphisms alter amino acid sequences contained in potential B-cell epitopes, an important observation considering the potential of the antigen as a vaccine candidate to cover distinct P. vivax endemic areas worldwide.


Introduction
Malaria remains an important public health problem in several countries of tropical and subtropical regions of the world. In 2019, the disease caused an estimated 229 million clinical cases and around 409,000 deaths worldwide [1]. Among the Plasmodium species causing malaria in humans, Plasmodium vivax is the most widely distributed and prevalent outside of Africa [2]. In Brazil, endemic regions are restricted to the Legal Amazon, a region that currently accounts for the majority (>99%) of the countrywide malaria burden [3] and where P. vivax is predominant, with approximately 90% of the reported cases [4]. Several examined for species identification. To increase the sensitivity of parasite detection, molecular analyses using specific primers for genus (Plasmodium sp.) and species (P. falciparum and P. vivax) were performed in all the samples as previously described [26]. Donors positive for P. vivax and/or P. falciparum at the time of blood collection were subsequently treated by the chemotherapeutic regimen recommended by the Brazilian Ministry of Health.

Ethical Considerations
The study protocol was approved by the Research Ethics Committee responsible for each locality, and patients' written consent was obtained for research use of their blood samples. Sample collection in Cruzeiro do Sul, Mâncio Lima and Guajará was reviewed and approved by the Fundação Oswaldo Cruz Research Ethics Committee (CEP-Fiocruz, CAAE 46084015.1.0000.5248). The protocol for the other blood sample collections was approved by the Research Ethics Committee of each locality: Manaus (CEP-Fiocruz): 346-613; Oiapoque (Hospital Municipal do Oiapoque/AP): 68980-000.

Genomic DNA Extraction
The DNA from 98 blood samples was previously extracted using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions and then stored at −20 °C until amplification.

Design of Pvcyrpa Specific Primers
The specific primers for the pvcyrpa gene (1101 bp) were designed using the gene sequence of the P. vivax Salvador-1 (Sal-1) strain from GenBank (NCBI Reference Sequence: XM_001615090.1; Gene ID: PVX_090240). All oligonucleotides were designed and checked for specificity using the Primer-BLAST tool provided by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/tools/primer-blast/, accessed on 13 August 2019), and their design quality was evaluated with OligoAnalyzer v.3.1 (https://www.idtDNA.com/calc/analyzer/, accessed on 13 August 2019) to avoid homodimers and heterodimers (Table 1). The specific primers were chemically synthesized and used for PCR and DNA sequencing. The pvcyrpa gene, located on chromosome 5, consists of two exons separated by a small, well-conserved intron and encodes a microneme protein [22,27-29]. Consequently, the two exons of the pvcyrpa gene were amplified separately from the extracted genomic DNA using two different primer sets, yielding two fragments of different lengths.
Table 1. Polymerase chain reaction (PCR) primers used for the amplification of the pvcyrpa gene.

PCR Amplification of Pvcyrpa Gene
All the pvcyrpa sequences reported in this study were amplified by conventional PCR using the two primer pairs described above. PCR reactions were carried out in a 25 µL volume that included 3 µL of DNA, 10 pmol/µL of each primer and the Master Mix kit (Promega, Madison, WI, USA) containing Taq DNA polymerase, PCR buffer and 10 nmol of each deoxynucleotide triphosphate (dNTP, Promega, Madison, WI, USA). The reactions were run on a GeneAmp PCR system 9700 (Applied Biosystems, Foster City, CA, USA) under the following amplification conditions: for the first exon, one step at 95 °C for 2 min, 30 cycles of 95 °C for 1 min, 56 °C for 1 min and 72 °C for 1 min, and a last step at 72 °C for 1 min. For the second exon, the temperatures and the number of cycles remained the same, except for the annealing temperature, which was 59 °C for 1 min. The PvCyRPA F1/R1 primer pair generated the first pvcyrpa gene fragment (600 bp) and the PvCyRPA F2/R2 primer pair yielded the second pvcyrpa gene fragment (466 bp). In all reactions, two negative controls (one without DNA and the other with DNA extracted from in vitro culture of the P. falciparum PSS1 strain) and a positive control (P. vivax-infected sample) were used. To confirm that DNA from the in vitro culture of P. falciparum was present and that the lack of amplification was due to the specificity of the primers for pvcyrpa, we amplified the P. falciparum p126 gene fragment and performed electrophoresis as previously described [30]. After PCR, 10 µL of amplified products were size-fractionated by electrophoresis in a 2% agarose gel (Sigma Aldrich, Missouri, USA) in 1× TAE buffer (0.04 M TRIS-acetate, 1 mM EDTA) in the presence of 1× GelRed nucleic acid stain (Biotium, Fremont, CA, USA). PCR products were visualized by ultraviolet (UV) illumination.
Product sizes were estimated using a GeneRuler 100 bp Plus DNA Ladder (Thermo Scientific, Waltham, MA, USA). Amplicons were then purified using the GE Healthcare Lifesciences kit following the manufacturer's instructions. Afterward, 5-50 ng of DNA was used per Sanger sequencing reaction, with forward and reverse primers.

DNA Sequencing and Polymorphism Analysis
The specificity of the assay was confirmed by sequencing the PCR products from all positive samples using a Big Dye Terminator Sequencing Kit (Applied Biosystems, Foster City, CA, USA) following the manufacturer's instructions. DNA sequencing was carried out on an ABI 3730xl DNA analyzer (Applied Biosystems, Foster City, CA, USA) with the support of the Fiocruz Genomic Platform, and all the results were analyzed using DNASTAR's sequence alignment software [31]. The sequences were also analyzed in the BioEdit sequence alignment editor to better visualize SNP positions, employing ClustalW multiple sequence alignment with the Sal-1 strain as the reference sequence.
Multiple sequence alignment with Clustal Omega, distance matrix calculation and phylogenetic tree construction were performed in MegAlign Pro 15 (Lasergene DNASTAR), and the circular map of the protein alignment was generated using GenVision v15 (Lasergene DNASTAR).
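The comparison against the Sal-1 reference can be illustrated with a minimal sketch. It assumes two already-aligned amino acid sequences of equal length and reports substitutions in the notation used in this article (for example, K150R). The function name and the toy sequences are hypothetical and are not the study data or the DNASTAR/BioEdit workflow.

```python
def call_substitutions(reference: str, sample: str) -> list[str]:
    """List amino acid substitutions of `sample` relative to `reference`.

    Both sequences must come from the same alignment (equal length).
    Positions are 1-based and count reference residues only, so gap
    columns in the reference do not advance the numbering.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    position = 0
    for ref_aa, sample_aa in zip(reference, sample):
        if ref_aa == "-":
            continue  # insertion relative to the reference: no reference position
        position += 1
        if sample_aa != "-" and sample_aa != ref_aa:
            substitutions.append(f"{ref_aa}{position}{sample_aa}")
    return substitutions


# Toy aligned fragments (hypothetical, not PvCyRPA sequence data).
toy_reference = "MKRDLE"
toy_isolate = "MKKDLD"
print(call_substitutions(toy_reference, toy_isolate))
```

Deletions in the sample are skipped here rather than reported; a fuller treatment would record them separately.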

Genetic Analysis of the Coding Gene
Genetic diversity of the pvcyrpa sequences was analyzed using the DnaSP v6 software [33] to estimate within-population diversity from the following parameters: the number of segregating sites (S), the number of haplotypes (h), haplotype diversity (Hd), and nucleotide diversity (π).
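As an illustration of what these four parameters measure, the sketch below computes them from a small set of aligned sequences using the standard textbook definitions: Hd = n/(n − 1) · (1 − Σ p_i²) and π as the mean number of pairwise differences per site. It is not DnaSP's implementation, it assumes gap-free alignments, and the function name and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def diversity_summary(sequences: list[str]) -> dict[str, float]:
    """Return S, h, Hd and pi for a gap-free alignment of equal-length sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: alignment columns showing more than one nucleotide.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)

    # h and Hd: distinct sequences and their frequency-based diversity.
    counts = Counter(sequences)
    hap_diversity = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # pi: mean pairwise differences, expressed per site.
    pair_diffs = [
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    ]
    pi = sum(pair_diffs) / len(pair_diffs) / length

    return {"S": seg_sites, "h": len(counts), "Hd": hap_diversity, "pi": pi}


# Toy alignment (hypothetical, not pvcyrpa data).
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGA", "ACCTACGT"]
print(diversity_summary(toy))
```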
Natural selection in pvcyrpa was assessed with Tajima's D and the Z-test. To test the neutral theory of evolution, Tajima's D values [34] were calculated from the total number of mutations, also estimated with the DnaSP v.6 software. This test provides information about the selective and demographic forces acting on a population. Positive values may suggest balancing selection, a force that maintains alleles at intermediate frequencies. Negative values, on the other hand, suggest purifying selection or recent population expansion [34]. The Z-test was performed with MEGA7 v.6.0; the ratio of non-synonymous (dN) to synonymous (dS) substitution rates (dN/dS) was estimated (1000 bootstrap replicates) with Nei and Gojobori's method [35] and the Jukes and Cantor correction, and p < 0.05 was considered significant.
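For readers unfamiliar with the statistic, the following is a minimal sketch of Tajima's D as defined by Tajima (1989). It contrasts the mean number of pairwise differences (k) with the number of segregating sites scaled by a1. One assumption should be flagged: this sketch uses segregating sites, whereas the analysis described above used the total number of mutations in DnaSP, so values may differ at sites with more than two alleles. The function name and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences: list[str]) -> float:
    """Tajima's D from a gap-free alignment, using segregating sites (S)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least four sequences (variance is zero for n = 3)")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # k: mean number of pairwise nucleotide differences (not per site).
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Toy alignments (hypothetical): intermediate-frequency variants vs. singletons only.
intermediate = ["ACGTACGT", "ACGTACGA", "ACGTACGA", "ACCTACGT"]
singletons = ["AAAA", "CAAA", "ACAA", "AACA", "AAAC"]
print(tajimas_d(intermediate), tajimas_d(singletons))
```

Variants at intermediate frequency push D upward, while an excess of rare variants (singletons) pushes it downward, which is the contrast the neutrality test relies on.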
Haplotype data were also generated using DnaSP v.6, and the haplotype network was constructed using PopArt v.1.7 with the median-joining algorithm [36] to explore the parasite relationships based on the pvcyrpa gene. Connections between haplotypes represent mutational steps, and empty squares show non-sampled or extinct haplotypes. The color of each circle represents the geographic origin of the haplotype, while its size represents the haplotype frequency.
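The inputs to such a network can be sketched as follows: identical sequences are collapsed into haplotypes, the localities contributing to each haplotype are tallied, and the number of mutational steps between haplotypes is counted as the number of differing sites. This is only the bookkeeping that precedes network construction, not the median-joining algorithm itself, and the function names, labels and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collapse_haplotypes(isolates: list[tuple[str, str]]) -> dict[str, tuple[str, Counter]]:
    """Group (locality, sequence) pairs into haplotypes labeled in order of first appearance."""
    by_sequence: dict[str, Counter] = {}
    for locality, sequence in isolates:
        by_sequence.setdefault(sequence, Counter())[locality] += 1
    return {
        f"Hap_{index}": (sequence, localities)
        for index, (sequence, localities) in enumerate(by_sequence.items(), start=1)
    }


def mutational_steps(seq_a: str, seq_b: str) -> int:
    """Number of differing sites between two aligned sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Toy isolates (hypothetical sequences; locality names taken from the study areas).
toy_isolates = [
    ("Manaus", "ACGT"),
    ("Oiapoque", "ACGT"),
    ("Manaus", "ACGA"),
    ("Guajará", "TCGA"),
]
haplotypes = collapse_haplotypes(toy_isolates)
for name, (sequence, localities) in haplotypes.items():
    print(name, sequence, dict(localities))
for (name_a, (seq_a, _)), (name_b, (seq_b, _)) in combinations(haplotypes.items(), 2):
    print(name_a, name_b, mutational_steps(seq_a, seq_b))
```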

Prediction of Linear B-Cell Epitopes
Linear B-cell epitopes were predicted using the ElliPro algorithm and confirmed by overlap with the predictions of at least two other algorithms (BCPred, BepiPred, ABCpred and Emini). These sequence-based tools take a single sequence in FASTA format as input and assign each amino acid a prediction score, drawing on profiles of known antigens and on propensity scale methods based on hydrophilicity and secondary structure prediction. For each input sequence, the server outputs a prediction score, and the linear B-cell epitopes are predicted to be located at the residues with the highest scores. In addition, ElliPro predicts linear and discontinuous antibody epitopes based on a protein antigen's 3D structure and accepts two types of input data: protein sequence or structure (PDB format) [37]. This server associates each predicted epitope with a score, defined as a PI (Protrusion Index) value averaged over the epitope residues. In this method, residues with larger scores are associated with greater solvent accessibility.
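The confirmation rule described above (an ElliPro prediction supported by at least two other algorithms) can be sketched at the residue level. The source does not state how overlap was scored, so the per-residue counting below is an assumption, and the function names and all intervals are hypothetical examples rather than predictions for PvCyRPA.

```python
Interval = tuple[int, int]  # 1-based, inclusive residue range


def covered_positions(intervals: list[Interval]) -> set[int]:
    """Expand inclusive intervals into the set of residue positions they cover."""
    return {pos for start, end in intervals for pos in range(start, end + 1)}


def consensus_epitopes(
    primary: list[Interval],
    others: dict[str, list[Interval]],
    min_support: int = 2,
) -> list[Interval]:
    """Keep primary-predictor residues backed by >= min_support other predictors,
    then merge consecutive residues into segments."""
    other_sets = [covered_positions(intervals) for intervals in others.values()]
    kept = sorted(
        pos
        for pos in covered_positions(primary)
        if sum(pos in positions for positions in other_sets) >= min_support
    )
    segments: list[Interval] = []
    for pos in kept:
        if segments and pos == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], pos)  # extend the current segment
        else:
            segments.append((pos, pos))  # start a new segment
    return segments


# Hypothetical predictor outputs (not actual PvCyRPA predictions).
ellipro = [(10, 20), (40, 45)]
other_predictors = {
    "BCPred": [(12, 22)],
    "BepiPred": [(8, 18)],
    "ABCpred": [(41, 43)],
    "Emini": [(30, 35)],
}
print(consensus_epitopes(ellipro, other_predictors))
```

A stricter variant could require a minimum segment length or a minimum fraction of each ElliPro epitope to be supported; the threshold here is only the "at least two" rule.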

Molecular Characterization of the Pvcyrpa Gene in the Studied Regions
To identify the gene encoding PvCyRPA in isolates from Brazilian endemic areas, DNA was extracted from 98 blood samples of P. vivax-infected individuals living in the cities of Cruzeiro do Sul, Mâncio Lima, Guajará, Manaus and Oiapoque and subjected to molecular diagnosis by conventional PCR.
The pvcyrpa coding sequence spans 1101 bp across two exons and was therefore analyzed as two regions (exon-1 and exon-2). The primer combinations (PvCyRPA_F1/R1 and PvCyRPA_F2/R2) designed to cover the length of the targeted pvcyrpa gene amplified two fragments in 100% of samples (Figure 1). The PvCyRPA F1/R1 primer combination amplified a fragment of 600 bp, whereas the PvCyRPA F2/R2 primer combination amplified a fragment of 466 bp. Additionally, P. falciparum specimens from in vitro culture were tested for quality assurance and, as expected, yielded no amplification of the pvcyrpa gene (Figure 1B). The 98 PCR-amplified samples from individuals infected with P. vivax were then subjected to sequencing reactions to screen for single nucleotide polymorphisms in the pvcyrpa gene. All amplified fragments were sequenced and aligned for sequence analysis.

Population Genetic Analysis
In P. vivax isolates from the Brazilian Amazon, genetic diversity was heterogeneously distributed between the regions coding the two exons, with higher values for exon-1 than for exon-2 across localities (Table 3). For the entire coding gene, Mâncio Lima isolates had the highest nucleotide diversity (π) (0.01272 ± 0.00064), while Manaus had the lowest (0.01116 ± 0.00030). Oiapoque had the highest haplotype diversity (Hd) (1.000 ± 0.052), while Guajará had the lowest (0.833 ± 0.222). Similar differences were detected when each exon was analyzed separately. For exon-1, the highest nucleotide diversity among the five populations was again observed in the Mâncio Lima group (0.01517 ± 0.00095), while Manaus sequences displayed the lowest (0.01285 ± 0.00050). Moreover, parasites from Oiapoque presented the highest estimate of haplotype diversity (0.917 ± 0.092), whereas parasites from Guajará showed the lowest Hd (0.667 ± 0.204). For exon-2, the highest nucleotide diversity was observed in the Mâncio Lima group (0.00938 ± 0.00077), whereas Guajará sequences displayed the lowest (0.00792 ± 0.00420). Isolates from Oiapoque presented the highest estimate of haplotype diversity (Hd) (0.889 ± 0.091), whereas parasites from Guajará showed the lowest Hd (0.500 ± 0.265). We performed Tajima's D and the Z-test to determine whether natural selection was affecting the pvcyrpa gene. The Tajima (Table 3).

Haplotype Network Analysis
The median-joining haplotype network constructed with PopArt 1.7 from the 98 sequences comprised a total of 50 haplotypes, some of which consisted of more than one sequence from the Brazilian Amazon (Figure 3). The network, built to explore the parasite relationships based on the pvcyrpa gene, comprised mutations at 34 segregating sites. All 50 haplotypes were closely related. The haplotypes Hap_1 and Hap_11 had high frequency and were shared by parasites from all five localities. The haplotypes Hap_4 (Cruzeiro do Sul, Mâncio Lima and Guajará) and Hap_8 (Cruzeiro do Sul, Mâncio Lima and Manaus) were each shared by parasites from three localities. Moreover, the haplotypes Hap_2 (Cruzeiro do Sul and Mâncio Lima), Hap_9 (Cruzeiro do Sul and Manaus), Hap_14 (Cruzeiro do Sul and Mâncio Lima) and Hap_23 (Manaus and Oiapoque) were shared by sequences from two localities. The other haplotypes had sequences from only one location.

Comparison of Amino Acid Variations in PvCyRPA among Genome Sequences Available Worldwide
The PvCyRPA amino acid substitutions identified in genome sequences worldwide, including those from the Brazilian Amazon, are summarized in Table 4. As observed in the protein sequence alignments, the PvCyRPA coding gene had an excess of non-synonymous mutations, which were more frequent in exon-1 than in exon-2. We subsequently aligned the protein sequences of these mutant field isolates with other hypothetical CyRPA proteins derived from P. vivax genome data available in the GenBank database and with the isolate Mexico-Southern Mexican [23]. Among a total of 31 amino acid substitutions in the PvCyRPA protein observed in P. vivax sequences worldwide, 26 are also present in our isolates. Interestingly, only our isolates showed a new substitution, K150R, found in the localities of Cruzeiro do Sul, Mâncio Lima and Guajará. In addition, we observed high genetic variability among the isolates of each locality. Curiously, Cruzeiro do Sul and Mâncio Lima present both variants existing at position Q142 when compared to the genome sequences. Likewise, all localities except Guajará (Cruzeiro do Sul, Mâncio Lima, Manaus and Oiapoque) presented both variants at position D145. Some amino acid substitutions of the PvCyRPA protein were rare and found in only some genomes: L180H in India VII; I63T and Y361H in North Korea; R125T and Q147K in SCO 66052.1 (Sanger Institute) (Table 4). Furthermore, we compared the consensus sequences of our Brazilian Amazon isolates with the reference sequence Sal-1 (PVX_090240) and the P01 strain (PVP01_0532400) and generated a circular map of the protein alignment using GenVision v15 (Lasergene DNASTAR) (Figure 4A). As expected, a high degree of identity was observed across the sequences analyzed, apart from the mutations found in relation to the reference sequence Sal-1.
The analysis showed high identity between our isolates and the P01 strain, despite the deletion of 4 amino acids present in the P01 sequence at positions 13-16 (FLFS). According to the pairwise distances, the percent identity ranged from 94.5% (P01 vs. GJ) to 99.7% (CZS vs. OIA) (Figure 4A,B).
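Percent identity between two aligned protein sequences can be sketched as below. How gap columns are treated affects the result when one sequence carries a deletion; this sketch assumes that columns with a gap in either sequence are excluded from the comparison, which may differ from the convention used by MegAlign Pro. The function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical residues over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ('-') in either sequence are ignored.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Toy aligned fragments (hypothetical, not the study sequences).
print(round(percent_identity("MKRDLE", "MKKDLD"), 1))
print(percent_identity("MKFLFSRD", "MK----RD"))
```

Counting gap columns as mismatches instead would lower the identity of any pair that differs by a deletion, so the convention should be stated when identities are reported.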

Figure 3. Median-joining network of pvcyrpa haplotypes. Each circle represents a unique haplotype; the color of the circle represents the geographic origin of the haplotype and its size represents the haplotype frequency. Lines separating haplotypes represent mutational steps.

Table 4. Amino acid variants of the PvCyRPA protein compared with the Sal-1 reference sequence (PVX_090240). Amino acid positions listed: 63, 69, 86, 90, 93, 95, 122, 125, 126, 127, 129, 131, 142, 145, 147, 149, 150, 154, 159, 170, 180, 185, 187, 220, 232, 260, 261, 264, 287, 361. • Indicates amino acid residues identical to the Sal-1 strain. Codons from 69 to

Polymorphisms and Potential B-Cell Epitopes
We performed in silico prediction of B-cell epitopes in the PvCyRPA protein using the Sal-1 reference sequence, and seven amino acid sequences were predicted as linear epitopes (Table 5). All epitopes were initially predicted by the ElliPro algorithm and were confirmed by overlap with the predictions of at least two other algorithms (BCPred, BepiPred, ABCpred and Emini). The sequences varied from 9 to 19 amino acids in length, and five of them were contained in conformational epitopes. The protein appears to have epitopes mainly in the central and C-terminal regions, while the N-terminal region does not contain antigenic sequences. Of note, amino acid polymorphisms were detected in regions predicted to contain residues participating in B-cell epitopes (Table 6).
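Checking which polymorphic positions fall inside predicted epitopes amounts to an interval lookup, sketched below. The polymorphic positions are taken from substitutions named in this article (for example R122K, K131E, Q142, D145, D149G, K150R, A154D and E159D), but the epitope names and boundaries are hypothetical placeholders, since the predicted coordinates are given only in Table 5.

```python
def polymorphisms_in_epitopes(
    epitopes: dict[str, tuple[int, int]],
    polymorphic_positions: list[int],
) -> dict[str, list[int]]:
    """Map each epitope (1-based, inclusive range) to the polymorphic positions it contains."""
    return {
        name: [pos for pos in sorted(polymorphic_positions) if start <= pos <= end]
        for name, (start, end) in epitopes.items()
    }


# Positions of substitutions reported in the text.
reported_positions = [122, 131, 142, 145, 149, 150, 154, 159]

# Hypothetical epitope boundaries, for illustration only (see Table 5 for the predictions).
example_epitopes = {"epitope_A": (120, 132), "epitope_B": (200, 210)}

print(polymorphisms_in_epitopes(example_epitopes, reported_positions))
```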

Discussion
The invasion of the red blood cell by Plasmodium merozoites is essential for parasite survival and proliferation. The merozoites have therefore evolved multiple pathways, using various antigenic proteins that aid in the invasion process. Among the merozoite's invasive proteins is the Cysteine-Rich Protective Antigen (CyRPA), which seems to be essential for the parasite's life cycle during the invasion of erythrocytes and acts as a ligand for reticulocyte invasion [38]. The discovery of this antigen has renewed hope in the search for an effective blood-stage vaccine against P. vivax malaria. However, one of the major obstacles to malaria vaccine development is still the low efficiency of the proteins used as immunogens in inducing protection, which can be explained, in part, by genetic polymorphisms [39]. Understanding the mechanisms of genetic recombination and sequence variation that shape the repertoire of polymorphic malarial surface antigens may help in designing vaccines [29,40]. The genetic diversity of these proteins in hyperendemic areas has been described as a limiting factor for the rapid acquisition of protective immunity and, consequently, for the development of an effective vaccine. Furthermore, the antigenic polymorphism of P. vivax vaccine candidates has been little discussed in unstable transmission areas such as the Brazilian endemic regions [41]. Thus, considering that malaria transmission in Brazil is unstable and that the genetic polymorphism of pvcyrpa remains unknown, we aimed to identify the pvcyrpa gene in isolates from different regions of the Brazilian Amazon and to study the potential impact of its genetic diversity on potential B-cell epitopes.
The genetic diversity of the pvcyrpa gene in isolates from different geographic regions of the Brazilian Amazon has not been previously studied. Despite the distance among the studied localities, human migration could promote gene flow among the Plasmodium vivax populations [22] and affect parasite transmission and dispersion [42,43]. Our first results showed that the pvcyrpa gene has high genetic variability in relation to the reference sequence Sal-1, presenting 27 polymorphic sites throughout the sequence, of which one was a synonymous and 26 were non-synonymous substitutions. Among these non-synonymous substitutions, two amino acid positions, Q142 (Q142K and Q142R) and D145 (D145G and D145N), presented one or two variants in our study areas. Overall, the R122K (n = 80; 82%), K131E (n = 77; 79%), D149G (n = 62; 63%), A154D (n = 60; 61%) and E159D (n = 66; 67%) mutations were the most frequent in our Brazilian Amazon isolates. The analysis of the pvcyrpa gene from the Brazilian Amazon showed that mutations have contributed to generating nucleotide and haplotype diversity. The similarity in the genetic diversity pattern suggests that similar evolutionary forces act on pvcyrpa across parasite populations and that the structural and/or functional properties are consistent. To evade the immune response, genes encoding antigenic proteins accumulate non-synonymous mutations, which leads to an increase in diversity. In this study, the pvcyrpa gene presented an accumulation of non-synonymous mutations in parasites of different regions, mainly in exon-1. Significant positive values of Tajima's D are consistent with balancing selection or population bottlenecks, while negative values suggest purifying selection or population expansion [34].
Tajima's D was significantly positive for exon-1 in Cruzeiro do Sul, Guajará and Manaus, and for exon-2 in Manaus. The results suggest that polymorphism at pvcyrpa exon-1 is generated by mutation and recombination and is probably maintained by balancing selection, which might represent an evolutionary advantage to the parasite. Exon-1 codes for highly variant domains exposed on the surface of infected red blood cells (RBCs), while exon-2 codes for the more conserved segment [29]. Furthermore, the level of genetic diversity in blood-stage antigens seems to be associated with the degree of exposure to the immune system [29,44].
The biological and genetic characteristics of P. vivax, host immunity, and local vectors may contribute to different patterns of demographic expansion [45]. Some discrete P. vivax lineages can remain stable across time in one of the areas with the highest malaria transmission in the Americas. Relapses can account for some clonal persistence because P. vivax strains are repeatedly reintroduced into the population as hypnozoites reactivate [46]. This context may explain why only Mâncio Lima and Oiapoque showed no significant positive values of Tajima's D. However, genomic epidemiology approaches can better reveal the complex distribution of this parasite in the Brazilian Amazon, as well as its relationship with the worldwide genetic diversity.
Moreover, we identified 50 different haplotypes of the pvcyrpa gene among the 98 P. vivax field isolates from the regions analyzed. The haplotype network, which explores the parasite relationships based on the pvcyrpa gene and comprises mutations at 34 segregating sites, confirmed the extensive genetic diversity observed in the pvcyrpa sequences. All 50 haplotypes were closely related, and some consisted of more than one sequence from the Brazilian Amazon. We observed that haplotypes Hap_1 and Hap_11 had high frequency and were shared by parasites from all five localities. These findings suggest that parasites carrying similar pvcyrpa genotypes are widely distributed. Additionally, when comparing our findings with PvCyRPA sequences from around the world, we observed a similar genetic profile among the complete genomes of P. vivax available in the GenBank database. Among a total of 31 amino acid substitutions of the PvCyRPA protein observed in P. vivax sequences worldwide, 26 are also present in our isolates.
To develop an effective malaria vaccine that can work in different regions of the world, it is important to include alleles that can induce an immune response and cover the antigenic diversity of the P. vivax population. Consequently, the existence of the same haplotypes in different malaria-endemic areas and of similar genetic profiles worldwide will be important for the rationale of malaria vaccine design. Moreover, as the immune system could act as a selective pressure and PvCyRPA is emerging as an alternative antigen in vaccine development, we also evaluated the impact of non-synonymous polymorphisms on predicted B-cell epitope sequences.
Amino acid variation was present in peptide regions potentially participating in B-cell epitopes, which supports the idea that this molecule is under selective immune pressure. Seven potential B-cell epitopes were identified in the PvCyRPA protein, most of which are contained in conformational epitopes, which corroborates its potential as an antibody target. A characteristic of malaria blood-stage antigens is their participation in merozoite invasion and immune evasion. Immunogenicity studies and molecular modeling are essential to determine the importance of PvCyRPA as a vaccine candidate. Targeting molecules important for the Plasmodium life cycle might be limited by their antigenic polymorphism or low immunogenicity. Molecular studies provide information about the dynamics of vaccine antigen polymorphisms that can be used to make informed decisions about which parasite alleles to include in vaccine formulations and to evaluate accurately the efficacy of vaccines tested in malaria-endemic areas [21]. Thus, an effective vaccine antigen should include alleles that induce host immune responses sufficiently broad to cover the existing antigenic diversity. Nevertheless, because of the higher genetic diversity of P. vivax compared to P. falciparum, generating a broad cross-reactive immune response against highly polymorphic asexual-stage antigens faces even greater challenges [47].

Conclusions
In summary, the present study explored the genetic polymorphism of PvCyRPA in field isolates from distinct endemic areas in Brazil, showing a moderate sequence variation, which could influence the potential B-cell epitopes and, consequently, antibody recognition. Despite the observed amino acid changes in the studied population and sequences worldwide, the potential antibody targets did not seem to be significantly affected. However, due to the paucity of information on PvCyRPA genetic diversity and its potential as a vaccine candidate, more studies are necessary to confirm the impact of PvCyRPA polymorphism in naturally acquired immune response and/or vaccine development.