Assessment of Mixed Plasmodium falciparum sera5 Infection in Endemic Burkitt Lymphoma: A Case-Control Study in Malawi

Simple Summary
Plasmodium falciparum (Pf) infection is a risk factor for endemic Burkitt lymphoma (eBL), the commonest childhood cancer in Africa, but the biomarkers of Pf infection that predict this risk are unknown. There is some evidence that the genetic complexity of Pf infection may be a risk factor. In 200 children with eBL and 140 children without eBL in Malawi, this study compared variants of the malaria parasite, focusing on Pfsera5, a gene that codes for a malaria protein that an infected person’s antibodies target to suppress the parasite. Multiple Pfsera5 variants, which arise when the parasite is not suppressed, were found in 41.7% of eBL children versus 24.3% of other local children, meaning that the odds of eBL were 2.4-fold higher with multiple Pfsera5 variants. No specific type of variant was related to eBL risk. Research to quantify malaria parasite variants and to clarify the host immune response needed to control variant infections may yield a test to predict eBL risk.

Abstract
Background: Endemic Burkitt lymphoma (eBL) is the most common childhood cancer in Africa and is linked to Plasmodium falciparum (Pf) malaria infection, one of the most common and deadly childhood infections in Africa; however, the role of Pf genetic diversity is unclear. A potential role of Pf genetic diversity in eBL has been suggested by a correlation of age-specific patterns of eBL with the complexity of Pf infection in Ghana, Uganda, and Tanzania, as well as a finding of significantly higher Pf genetic diversity, based on a sensitive molecular barcode assay, in eBL cases than in matched controls in Malawi. We examined this hypothesis by measuring diversity in Pf-serine repeat antigen-5 (Pfsera5), an antigenic target of blood-stage immunity to malaria, among 200 eBL cases and 140 controls, all Pf polymerase chain reaction (PCR)-positive, in Malawi. Methods: We performed Pfsera5 PCR and sequencing (~3.3 kb over exons II–IV) to determine single or mixed PfSERA5 infection status. The patterns of Pfsera5 PCR positivity, mixed infection, sequence variants, and haplotypes among eBL cases, controls, and both groups combined/pooled were analyzed using frequency tables. The association of mixed Pfsera5 infection with eBL was evaluated using logistic regression, controlling for age, sex, and previously measured Pf genetic diversity. Results: Pfsera5 PCR was positive in 108 eBL cases and 70 controls. Mixed PfSERA5 infection was detected in 41.7% of eBL cases versus 24.3% of controls; the odds ratio (OR) was 2.18, and the 95% confidence interval (CI) was 1.12–4.26, which remained significant in adjusted results (adjusted odds ratio [aOR] of 2.40, 95% CI of 1.11–5.17). A total of 29 nucleotide variations and 96 haplotypes were identified, but these were unrelated to eBL. Conclusions: Our results add to the evidence supporting the hypothesis that mixed Pf infection is more frequent in eBL and suggest that measuring Pf genetic diversity may provide new insights into the role of Pf infection in eBL.


Introduction
Endemic Burkitt lymphoma (eBL) is an aggressive B-cell non-Hodgkin lymphoma (NHL), first described in African children by Denis Burkitt in 1958 [1]. High incidence of eBL overlaps with holoendemic Plasmodium falciparum (Pf) infection in Africa [2], and it is one of the commonest childhood cancers [3] in countries with high malaria endemicity, including Malawi, Uganda, Tanzania, and Kenya [4]. The eBL incidence is 30-fold higher in these countries than in non-malaria endemic countries [5]. The correlation of eBL with malaria in population-based and individual-level studies, the findings of altered anti-Pf antibody levels in eBL cases compared to healthy children [6][7][8], and the reduced frequency of genetic variants that protect against malaria in eBL cases compared to healthy children [9,10] suggest that malaria infection may be related to eBL risk. Biologically, Pf may increase eBL risk directly by stimulating the polyclonal expansion of B lymphocytes, triggering chromosomal instability in B cells [11,12], or indirectly, by impairing immunologic control of the Epstein-Barr virus (EBV), a known carcinogen for eBL [13].
The biomarkers of Pf related to eBL are unknown. A correlation study of 2602 eBL cases in Ghana, Uganda, and Tanzania with published biomarkers of Pf infection (parasitemia, parasite density, and genetic diversity of parasites) in the same countries [14] was the first to suggest that Pf genetic diversity may be related to eBL, based on a significant association between age-specific patterns of eBL and Pf genetic diversity. This hypothesis was evaluated among 303 eBL cases and 274 controls in Malawi by measuring Pf diversity using a Pf3D7 molecular barcode assay [15]. The results showed a significantly higher mean Pf genetic diversity score in the eBL cases than in the controls [16]. Although these two studies represent the only evidence linking Pf genetic diversity with eBL, the results may be valid because they are based on both population- and individual-level designs, and on an assessment of Pf diversity using a sensitive and specific molecular assay.
In the present study, we investigated whether the presence of mixed infection at the Pf serine repeat antigen-5 (Pfsera5) locus (on chromosome 2) [17] is associated with eBL in Malawi. Pfsera5 has been identified as a blood-stage vaccine candidate [18,19], in part because antibodies targeting Pfsera5, measured using SE36 (a recombinant molecule of PfSERA5 [20]), were protective against malaria and eBL [7]. The potential as a vaccine target also prompted interest in the genetic diversity at this locus. Pfsera5 codes for a 120-kDa precursor (Figure 1A), which is critical to Pf blood-stage infection (and egress from the parasitophorous vacuole) and is an antigenic target of blood-stage immunity to malaria [21]. The PfSERA5 precursor is processed by removal of the signal peptide and cleavage of the protein into three fragments: the P47, P56, and P18 kDa domains; P56 is further processed to P50 and P6 fragments [22][23][24][25]. Pfsera5 sequence diversity is introduced by insertions and deletions in the P47 fragment, in the protein domain containing the octamer and serine repeats (where protective epitopes are located) [26], and by point mutations in a stretch without repeats [27], but the association of Pfsera5 diversity with eBL is unknown.

Figure 1 (caption fragment). The illustration resembles Figure 1 in Tanabe et al. [27] by the same authors, but the details are different, so they are appropriate for the message in this paper.

Study Patients
The study was conducted in the Infections and Childhood Cancer study in Malawi [28]. Briefly, children aged 0 to 15 years diagnosed with cancers (eBL, leukemia, Hodgkin lymphoma, neuroblastoma, rhabdomyosarcoma, Ewing's sarcoma, primitive neuroectodermal tumor, and Wilms' tumor) were enrolled at the Queen Elizabeth Hospital in Blantyre, Malawi between July 2005 and August 2010. All children were reviewed by one investigator (EM) to verify the clinical diagnosis. Confirmation by histology, cytology, or other laboratory investigations was done when possible. Children with eBL were coded as cases and those with another diagnosis were coded as controls [16]. Children with HIV infection and those with Kaposi sarcoma were excluded. The current study included only children who were Pf-infected based on polymerase chain reaction (PCR) targeting a 519-base pair (bp) segment of the PF07-0076 locus [16], performed on samples taken at the time of diagnosis, before initiating cancer treatment.

Ethics Review
Ethical approval was given by the Malawian College of Medicine Research and Ethics Committee (P.03/04/277R) and the Office of Human Subjects Research at the National Institutes of Health (Exempt #: 4742). The parents/guardians of the children gave written informed consent.

Pfsera5 Gene Amplification
The PfSERA5 amplification was performed on residual genomic DNA extracted from coded/masked whole blood samples taken at the time of diagnosis before initiating cancer treatment, using a QIAamp Blood DNA Kit (Qiagen, Inc., Valencia, CA, USA) at Osaka University [16]. The PfSERA5 PCR amplification targeted the ~3.3 kb region in exons II-IV using the specific primers sera5-5F0 and sera5-3R0 (Table S1). The sequence before and after exon I (encoding the signal peptide) is AT-rich and contains poly-A residues, which makes it difficult to sequence successfully; thus, this region was excluded from our analysis. PCR amplification was carried out in a 25-μL reaction mixture containing 0.4 μM each of forward and reverse primers, 0.4 mM of deoxyribonucleotides (dNTP), 0.5 units of KOD FX Neo polymerase (TOYOBO Co., Ltd., Osaka, Japan), 12.5 μL of 2× PCR buffer, and 1 μL of genomic DNA solution. The PCR conditions were as follows: 95 °C for 2 min, followed by 40 cycles of 95 °C for 30 s, 59 °C for 30 s, and 68 °C for 4 min. A 2-μL aliquot of the PCR product was used as a template for a second PCR amplification in a 25-μL reaction mixture using the primers sera5-5F3 and sera5-3R2 (Table S1) under the same thermocycler conditions. Samples that were not successfully amplified using this primer set were retested using nested PCR primers targeting two fragments, each about half the original length (a 5′ half and a 3′ half). The nested primer sets used for the 5′ half fragments were sera5-5F0 and sera5-R0 for the first PCR, and sera5-5F3 and sera5-R0 for the nested PCR; for the 3′ half fragments, sera5-F1 and sera5-3R0 were used for the first PCR, followed by sera5-F2 and sera5-3R2 (Table S1). The nested PCR conditions were the same as for the full-length PCR, but the extension time was shortened to 2 min. Twelve duplicate samples were included for quality control. The results were concordant for PCR and PfSERA5 single or mixed infection in nine samples at the original testing, and three samples were resolved after re-testing; two were PCR-negative and one was a single infection.
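As a quick check on the reaction set-up above, the per-reaction amounts and nominal hold times can be worked out directly. The sketch below is illustrative only: the function and variable names are hypothetical, and the cycling estimate counts programmed hold times while ignoring ramp time and any final extension or hold step, which the text does not describe.

```python
# Per-reaction amounts and nominal cycling time for the PCR described above.
# All names are illustrative; only the concentrations, volume, cycle count,
# and hold times come from the text.

REACTION_VOLUME_UL = 25.0   # total reaction volume (microlitres)
PRIMER_UM = 0.4             # each primer (micromolar)
DNTP_MM = 0.4               # dNTPs (millimolar)


def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol: (micromol/L) x (microlitres) = pmol."""
    return conc_um * volume_ul


def cycling_minutes(initial_s: int, cycles: int, step_seconds: tuple) -> float:
    """Sum of programmed hold times in minutes (ramp time not included)."""
    return (initial_s + cycles * sum(step_seconds)) / 60.0


primer_pmol = amount_pmol(PRIMER_UM, REACTION_VOLUME_UL)            # per primer
dntp_nmol = amount_pmol(DNTP_MM * 1000.0, REACTION_VOLUME_UL) / 1000.0

# 95 C for 2 min, then 40 cycles of 30 s / 30 s / 4 min (full-length PCR)
full_length_min = cycling_minutes(120, 40, (30, 30, 240))
# Same programme with a 2-min extension (half-fragment nested PCR)
half_fragment_min = cycling_minutes(120, 40, (30, 30, 120))

print(f"primer per reaction: {primer_pmol:.1f} pmol")
print(f"dNTP per reaction:   {dntp_nmol:.1f} nmol")
print(f"full-length hold time:   {full_length_min:.0f} min")
print(f"half-fragment hold time: {half_fragment_min:.0f} min")
```

Under these assumptions each reaction holds 10 pmol of each primer, and one full-length round takes a little over three hours of hold time.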

Pfsera5 Nucleotide Sequencing
The PCR products were purified using a QIAquick PCR Purification kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. Purified DNA fragments were eluted in 30 μL of Tris-EDTA buffer (TE). The optical density was measured with a NanoDrop (Thermo Fisher Scientific, Wilmington, DE, USA) and the DNA concentration was adjusted, using TE, to 0.026 μg/μL for the 3.3 kb fragments and 0.013 μg/μL for the 5′-half and 3′-half fragments. At this concentration, 1 μL was suitable for performing one sequencing reaction. Pfsera5 DNA sequencing was performed directly using the BigDye® Terminator v3.1 Cycle Sequencing Kit and a 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Sequencing primers were designed to cover target regions in both directions (Table S1). The nucleotide sequences generated during the current study are available through public databases such as DDBJ and NCBI (Accession numbers: LC606291-LC606405). Samples whose sequence results showed no overlapping peaks on the electropherograms were interpreted as single parasite infections; otherwise, the sample was recorded as having a mixed parasite infection. The PfSERA5 sequence variations in single parasite infections were analyzed to gain insight into variation in parasite populations.
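One reading of the two target concentrations is that they give roughly equal molar amounts of template per microlitre, since the half fragments are about half as long. The sketch below checks this under two assumptions that are not stated in the text: an average mass of about 650 g/mol per base pair of double-stranded DNA, and a half-fragment length of about 1650 bp.

```python
# Convert a dsDNA mass concentration to a molar concentration.
# ASSUMPTIONS (not from the text): 650 g/mol per base pair on average,
# and ~1650 bp for each half fragment.

AVG_BP_MASS_G_PER_MOL = 650.0


def fmol_per_ul(ng_per_ul: float, length_bp: int) -> float:
    """Femtomoles of double-stranded fragment per microlitre."""
    grams_per_ul = ng_per_ul * 1e-9
    mol_per_ul = grams_per_ul / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return mol_per_ul * 1e15


full_fragment = fmol_per_ul(26.0, 3300)  # 0.026 ug/uL, ~3.3 kb fragment
half_fragment = fmol_per_ul(13.0, 1650)  # 0.013 ug/uL, assumed ~1.65 kb

print(f"full-length: {full_fragment:.1f} fmol/uL")
print(f"half-length: {half_fragment:.1f} fmol/uL")
```

With these assumed values both dilutions come to about 12 fmol of fragment per microlitre, which is consistent with 1 μL being used per sequencing reaction in either case.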

Sequence Alignment and Pf Population Genetics Analyses
The PfSERA5 nucleotide sequence obtained from each sample was aligned to that of the PfSERA5 3D7 strain using CLUSTALW implemented in GENETYX® ver. 15 (GENETYX Co., Ltd., Tokyo, Japan), and the alignment was manually inspected to ensure accuracy. Analyses were done for the entire sequence and for specific protein domains: the stretch of octamer repeats (OctR), the stretch of serine repeats (SerR), and the 2562 bp nucleotide sequence without repeats (the non-repeat region, or NonR), which corresponds to amino acid positions 87-193 and 251-997 [27]. The sequence information from the NonR was translated into its 854 amino acid (aa) sequence to distinguish nonsynonymous variations (those that change the aa sequence) from synonymous variations (those that do not). To gain insights into Pfsera5 variations in samples from nearby Tanzania or countries with a high eBL incidence (e.g., Ghana), we accessed the Pfsera5 sequences from 55 asymptomatic donors in the Rufiji River Delta in Tanzania sampled in 1993, 1998, and 2003 (Accession numbers: AB634928-AB634982) [29], and from 33 children in three villages in Ghana sampled in 2004 (Accession numbers: AB634983-AB635015) [29].
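The synonymous versus nonsynonymous call described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes two aligned, in-frame sequences of equal length with no indels, uses the standard genetic code, evaluates each variant site on its own against the reference codon, and runs on a short made-up sequence rather than on Pfsera5 data. The function names are hypothetical. (As a consistency check on the text, 2562 bp divided by 3 gives the stated 854 codons.)

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame nucleotide sequence (whole codons only)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def classify_variants(reference: str, sample: str) -> list:
    """Return (1-based position, call) for every site where sample differs."""
    calls = []
    for pos, (ref_base, alt_base) in enumerate(zip(reference, sample)):
        if ref_base == alt_base:
            continue
        start = pos - pos % 3
        ref_codon = reference[start:start + 3]
        offset = pos % 3
        # Substitute only this site into the reference codon.
        alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
        same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
        calls.append((pos + 1, "synonymous" if same_aa else "nonsynonymous"))
    return calls


# Toy sequences (hypothetical, not Pfsera5): Met-Ala-Lys-Asp
toy_reference = "ATGGCTAAAGAT"
toy_sample = "ATGGCCAAAGAA"   # GCT>GCC keeps Ala; GAT>GAA changes Asp to Glu

variant_calls = classify_variants(toy_reference, toy_sample)
print(translate(toy_reference), variant_calls)
```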
Haplotype diversity (Hd) and nucleotide diversity (π, the average number of nucleotide differences per site between two sequences in all possible pairs in the sample population) were calculated using DnaSP v5.10.01 [30]. The difference between the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) was calculated by the Nei and Gojobori method [31], implemented in MEGA X, with the Jukes and Cantor correction. The statistical significance of the difference between dN and dS was estimated using the MEGA Z-test [32]. A higher dS than dN may suggest that synonymous substitutions, which are neutral or can boost parasite fitness, are more likely to be retained or to accumulate, while nonsynonymous substitutions that reduce parasite fitness are removed or reduced in frequency. Wright's fixation index (Fst) was calculated to assess the genetic differentiation of Pfsera5 among the parasites circulating in the study populations [33]. The pairwise Fst between parasite populations was calculated using Arlequin v3.5 [34]. A small Fst indicates that the parasite allele frequencies within the study populations are similar, whereas a large Fst indicates that the allele frequencies are different and that the study populations do not share genetic diversity.
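The summary statistics named above can be illustrated on a toy alignment. The sketch below assumes the commonly used estimators: Hd = n/(n − 1) × (1 − Σ p_i²) over haplotype frequencies p_i, π as the mean pairwise proportion of differing sites, and the Jukes and Cantor correction d = −(3/4) ln(1 − 4p/3). It is not a reimplementation of DnaSP or MEGA, the function names are hypothetical, and the four sequences are made up.

```python
import math
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs: list) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs: list) -> float:
    """Mean number of differences per site over all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def jukes_cantor(p: float) -> float:
    """Correct an observed proportion of differences p for multiple hits."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical): three haplotypes among four sequences.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]

hd = haplotype_diversity(toy_alignment)
pi = nucleotide_diversity(toy_alignment)
print(f"Hd = {hd:.3f}, pi = {pi:.4f}, JC(0.1) = {jukes_cantor(0.1):.4f}")
```

The correction always pushes the distance above the raw proportion of differences, and the gap widens as sequences diverge; for the small differences typical of a conserved locus it changes little.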

Statistical Analyses
Patterns of Pfsera5 PCR positivity, mixed Pfsera5 infection, and Pfsera5 haplotypes and nucleotide diversity in eBL cases or controls, separately or combined/pooled, were analyzed for the entire Pfsera5 sequence and by domain (OctR, SerR, and NonR). The results were categorized into known haplotypes; the many "rare haplotypes" (defined among Malawi controls as SerR haplotypes observed in <10% or OctR haplotypes observed in <20% of the samples) were grouped and then compared to the common haplotypes. This grouping is post hoc, but it is useful for exploring pairwise comparisons of common versus rare haplotypes in eBL cases and controls, as well as haplotype patterns observed in Ghana and Tanzania. Differences across sample groups were evaluated using Fisher's exact test. Because age is an important predictor of exposure and immunity to malaria, the data were explored by age (<6 versus ≥6 years of age).
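For readers unfamiliar with Fisher's exact test, the sketch below computes a two-sided p-value for a 2×2 table by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. The counts are hypothetical and chosen only for illustration; they are not study data, and the function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum over tables at least as extreme (no more probable) than observed;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = two sample groups,
# columns = common vs rare haplotype.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```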
The association of mixed PfSERA5 infection with eBL was assessed by calculating odds ratios (ORs) and 95% confidence intervals (CIs) using logistic regression. Bivariate ORs (bORs) were obtained by adjusting separately for age, sex, and the Pf genetic diversity score as confounders [16], before including all these variables in one model for multivariate adjustment (aOR). Two-sided p-values < 0.05 were considered statistically significant. The results were not adjusted for multiple comparisons; thus, comparisons other than the primary association must be interpreted cautiously, as hypothesis generating.
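For a single binary exposure, the unadjusted OR from logistic regression equals the cross-product ratio of the 2×2 table, and its Wald interval equals the Woolf (log-OR) interval. The sketch below applies this to counts reconstructed from the reported percentages, which is an assumption: 45 mixed and 63 single infections among 108 PCR-positive eBL cases, and 17 mixed and 52 single infections among the 69 controls with a determinable status. The function name is hypothetical, and the adjusted model is not reproduced here.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Odds ratio with a Woolf (log-scale Wald) confidence interval.

    a, b = exposed / unexposed among cases
    c, d = exposed / unexposed among controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ASSUMED counts, back-calculated from the reported percentages:
# cases: 45 mixed, 63 single; controls: 17 mixed, 52 single.
unadjusted_or, ci_low, ci_high = odds_ratio_ci(45, 63, 17, 52)
print(f"OR = {unadjusted_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
# prints: OR = 2.18 (95% CI 1.12-4.26)
```

With these reconstructed counts the sketch reproduces the reported unadjusted estimate; the adjusted OR would additionally require the individual-level covariates.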

Characteristics of Study Subjects
The current study evaluated 341 Pf PCR-positive participants. One participant was >17 years and was excluded from subsequent analyses. The remaining participants included 200 eBL cases and 140 controls (Table 1). Most participants were from Malawi (n = 327), but 13 (10 eBL cases and 3 controls) were from Mozambique, a neighboring country. Because of the small number of children originating in Mozambique, the sequence data from those children were analyzed together with those from Malawi, where all children were enrolled. The three leading diagnoses among the controls were renal tumors (n = 28), non-eBL lymphoma and leukemia (n = 35, including 9 with leukemia), and soft tissue tumors (n = 19). The eBL cases were slightly older than the controls, but the difference was not statistically significant (7.2 years versus 6.9 years, p = 0.44). However, the eBL cases were more likely to be aged 6-10 years than the controls (50.5% versus 25.7%, p for heterogeneity < 0.001). The male-to-female proportion of eBL cases and controls was not statistically different in this set (62.5% versus 58.6% in females, p = 0.50).

Pfsera5 PCR positivity was detected in 108 (54.0%) eBL cases and 70 (50.0%) controls (Table 1). The PCR product was insufficient in one control sample, so it could not be conclusively ascertained whether that sample was a single or mixed Pfsera5 infection. Likewise, not all Pf-positive samples were suitable for amplification of the near full-length Pfsera5, due mainly to the quality of the samples available for the study.

Pfsera5 Haplotype Diversity in eBL Cases and Controls in Malawi
Hd and π were calculated for 115 samples in Malawi and 88 individuals in Tanzania and Ghana with single Pfsera5 infection. As shown in Figure 1B, the parasites harbored variants at 29 nucleotide positions in the sequenced 2562 bp NonR region; 17 were nonsynonymous, leading to amino acid changes (highlighted orange), and 12 were synonymous (highlighted green). The numbers of haplotypes are shown in Table 2. Analysis of the entire Pfsera5 sequence yielded 96 haplotypes (88 haplotypes at the amino acid sequence level), with 34 in the OctR, 42 in the SerR, and 26 in the NonR regions (Table 2, detailed in Figures S1 and S2). The Hd for the entire Pfsera5 sequence was 0.993 in eBL cases, 0.996 in the controls, and 0.991 and 0.996 in the samples from Tanzania and Ghana, respectively (Table 2). When Hd was considered by Pfsera5 region, it was lowest for the NonR region and moderately low for the OctR region; Hd was similar for the entire Pfsera5 and the SerR region, which bears the protective epitopes of PfSERA5 [26]. Hd in the NonR region was lower in Tanzania (0.362) and Ghana (0.472) than in the eBL cases and controls combined (Malawi and Mozambique; 0.545).

Patterns of Common Versus Rare Pfsera5 Haplotypes in eBL Cases and Controls
In analyses restricted to those with a single Pfsera5 infection, the frequency of the common haplotypes in the OctR and SerR regions was similar in children with eBL and controls. The common haplotypes in the OctR and SerR regions were more frequent in the combined eBL cases and controls in Malawi and Mozambique than in Ghana and Tanzania (Figure 2, Figures S1 and S2, p < 0.05).

Table 3 shows nucleotide diversity (π) in the 2562 bp NonR and Table 4 shows the location of the 29 polymorphic sites (12 synonymous and 17 non-synonymous). There are 23 new variations, that is, variations not previously reported, while 6 (5 non-synonymous and 1 synonymous) have previously been reported [21], but the specific variations did not differ between the cases and controls. The eBL cases, controls, and individuals in Tanzania and Ghana had similar nucleotide diversity (π). However, the current eBL cases and controls (combined, Malawi and Mozambique) had dS > dN, while the opposite pattern (i.e., dN > dS) was observed in the samples from Tanzania (p < 0.06) and Ghana (p = 0.0034) (Table 3). Fst analysis suggested genetic differentiation (p < 0.05) in the OctR and SerR, but not the NonR regions for parasites in the eBL cases and controls (combined) and in individuals in Ghana (Table 5). However, Fst analysis did not support evidence of genetic differentiation between parasites in Malawi and Mozambique versus those in Tanzania and Ghana.

[Figure 2 panel label: Common, eBL Case, n = 63]

Discussion
We investigated the hypothesis that Pf genetic diversity, based on the detection of mixed Pfsera5 infection, is associated with eBL in Malawi [16]. Our finding of 2.40-fold higher odds of mixed PfSERA5 infection in eBL cases than in controls, which remained statistically significant after adjustment for confounders, including gender, age group, Pf DNA copy, and Pf genetic diversity [16], led us to reject the null hypothesis. These results strengthen the evidence that Pf genetic diversity may be related to eBL, as highlighted in our earlier ecological study in Uganda, Tanzania, and Ghana [14] and in our case-control study in Malawi using a sensitive molecular assay [16].
There is accumulating evidence that mixed Pf infection may be a molecular surrogate of chronic asymptomatic Pf infection in high malaria transmission areas [35]. The consequences for malaria risk appear similar to those in earlier studies, that is, the risk for malaria (infection and symptoms) is high in children below age 5 [36,37] but low in children above age 5 [38], consistent with the level of acquired immunity against malaria at those ages [39]. Thus, in older children, mixed infection appears to be a molecular marker of asymptomatic infection [35], incomplete clearance of parasites [40], clinical and parasite immunity [41], and, according to our results, eBL risk. A recent finding of a median multiplicity of infection of six haplotypes in infective mosquitoes suggests that mixed infections may arise from co-transmission of genetically diverse oocysts from a single mosquito bite [35]. Super-infection, that is, the inoculation of different parasites from different mosquitoes, is also possible [42] in partially immune people living in malarious, non-sheltered environments [43].
Chronic, asymptomatic, low-density parasite infection has been reported in eBL cases and may precede the disease [40]. Asymptomatic infection is due, in part, to antibodies that target parasite proteins, particularly those that are secreted and embedded in infected red blood cell (iRBC) membranes, thereby blocking sequestration and promoting splenic clearance of iRBCs [44][45][46], and thus suppressing parasite density and clinical symptoms. However, because asymptomatic infections are sub-clinical, they remain untreated for prolonged periods; the associated chronic immune response may lead to eBL as a rare complication in children unsheltered from malaria [8]. While a strong immune response may explain the reduced frequency of symptomatic malaria and the lower parasite densities in eBL cases than in other children in the same region [40], we speculate that Pfsera5 may contribute to the molecular camouflage that facilitates tolerance of low-grade infection and sets the stage for progression to cancer. The PfSERA5 N-terminal domain decorates the surface of the merozoite, where it tightly binds to a host serum protein, vitronectin, which in turn binds to another host protein that camouflages the parasite from the host immune system [47], potentially reducing the pressure for mutations in the Pfsera5 gene. However, this immune response is not completely effective, resulting in immune tolerance, high antibody titers against PfSERA5, and low-grade parasitemia [20]. The chronic nature of malaria infection, and how it amplifies or down-regulates the immune response, is still poorly understood, although the continuous insult by low-grade, often genetically complex variants (mean of 4) [35] likely increases the risk of genetic instability in B cells and their progression to eBL [11]. In agreement with this hypothesis, the spleen is frequently enlarged in children with asymptomatic Pf infection [48], who, we hypothesize, are the population at risk of eBL [40].
Plausibly, a robust B-cell response promotes survival against malaria [49], on the one hand suppressing Pf positivity in eBL [40,43,50] while, on the other hand, promoting eBL risk in children with asymptomatic infection.
The use of Pfsera5 molecular data expands our knowledge of variants in Pf field isolates in countries where eBL incidence is high [4]. Consistent with previous reports [27], we observed that Pfsera5 is highly conserved with limited evidence of genetic differentiation, consistent with the frequent gene flow in Africa. We also observed that Pfsera5 diversity was not unique to Malawi and Mozambique or to eBL cases. These results are reassuring because our comparisons are based on samples obtained using varied, potentially biased methods. The samples from Ghana and Tanzania are from specific communities living in relatively small geographical areas, whereas the Malawi and Mozambique samples are from a much larger geographical area. Thus, the observations that no specific Pf variants or haplotypes are associated with eBL and that both eBL cases and controls in Malawi harbor common haplotypes are consistent with both groups being exposed to similarly high environmental pressure of Pf infection [35]. However, we note that rare haplotypes may be harder to detect because they exist at lower submicroscopic levels. Our findings of a lower frequency of the common haplotypes in Tanzania and Ghana than in Malawi and Mozambique may be due to geographic confounding, as noted above.
Our research suggests that molecular markers of Pf genetic diversity may complement the search for biomarkers of Pf infection related to eBL risk. The general finding that PfSERA5 does not show antigenic variation [20] suggests that Pfsera5 might be a good tool to characterize the genetic complexity of parasite populations in eBL cases and controls. However, this should be balanced against the likelihood that the sample size will shrink, because amplifying about 3.2 kb of Pfsera5 requires higher quality gDNA than amplifying the 519 bp segment that we used to confirm Pf infection in all our samples. This explains why our sample size shrank by about 50%. Other limitations of our study include the use of cross-sectional samples from a case-control study, and the testing of one sample from a single time point, which underestimates parasite diversity [51]. These limitations are balanced by our hypothesis-driven approach, our focus on diversity at a locus that is an antigenic target of blood-stage immunity to malaria, and the use of cancer controls, in whom referral and reverse-causality biases are minimized. Our study included only children who were confirmed to have Pf infection by PCR [16], which increases the validity of our assumption that all children were exposed to infection.

Conclusions
Our study provides new evidence for the association of mixed Pf infection with eBL risk in Malawi. Further research utilizing molecular markers of Pf infection may lead to the discovery of biomarkers of eBL, severe malaria, and/or asymptomatic infection.

Supplementary Materials:
The following are available online at www.mdpi.com/2072-6694/13/7/1692/s1: Figure S1. Allele distribution in sera5 octamer repeat (OctR) region; Figure S2. Allele distribution in SERA5 serine repeat (SerR) region; Table S1. Primers for PCR amplification and sequencing; Table S2. Odds ratio (OR) and 95% confidence interval (CI) for eBL case status in univariate and multivariate logistic regression models.

Informed Consent Statement:
Written informed consent was obtained from the parents or guardians of the children.

Data Availability Statement:
The Pfsera5 sequence data has been uploaded to DDBJ (Accession numbers: LC606291-LC606405). The remaining clinical data and the code used for these analyses are available upon reasonable request from the corresponding authors.