Genetic Sequence Variation in the Plasmodium falciparum Histidine-Rich Protein 2 Gene from Field Isolates in Tanzania: Impact on Malaria Rapid Diagnosis

Malaria rapid diagnostic tests (RDTs) are crucial for managing the disease, and the effectiveness of detection depends on parameters such as the sensitivity and specificity of the RDT. Several factors can affect RDT performance. In this study, we focused on sequence variation in the Plasmodium falciparum histidine-rich protein 2 gene (pfhrp2) and its impact on RDTs that target the antigen encoded by this gene. Field samples collected during cross-sectional surveys in Tanzania were sequenced to investigate pfhrp2 sequence diversity and evaluate its impact on HRP2-based RDT performance. We observed significant differences in the mean number of amino acid repeats between the current study and a previous study. Several new amino acid repeats were found to occur at different frequencies, including types AAY, AHHAHHAAN, and AHHAA. Based on the abundance of types 2 and 7 amino acid repeats, the binary predictive model predicted that HRP2-based RDTs would be insensitive for about 69% of the samples from the study area. About 85% of the major epitopes targeted by monoclonal antibodies (MAbs) in RDTs were identified. Our study suggests that the extensive sequence variation in pfhrp2 can contribute to reduced RDT sensitivity. The correlation between the different combinations of amino acid repeats and the performance of RDTs in different malaria transmission settings should be investigated further.


Introduction
Malaria control and elimination largely depend on prompt and accurate diagnosis for effective treatment [1]. Since its inception in the early 1990s, point-of-care diagnosis has proved reliable for malaria in most parts of the world [2,3]. Demand for and supply of test kits have risen steadily over the last 20 years [4]. Several companies together sold approximately 348 million malaria rapid diagnostic test kits in 2019 [5]. The sub-Saharan African region (SSA) received about 80% of all RDT kits distributed globally, with more than 25 million (7%) of those kits distributed in Tanzania [5].
There is increasing evidence of Plasmodium falciparum parasites lacking the pfhrp2/3 genes, which enables them to evade detection by HRP2-based RDTs. A study from Eritrea indicated that pfhrp2 and pfhrp3 deletions were prevalent at 80.8% and 92.3%, respectively, which prompted the switch to non-HRP2 RDTs [6]. Studies conducted in Tanzania have shown no pfhrp2/3 deletions in some areas [7,8], but a low percentage has been reported in other parts of the country [9,10]. With regard to pfhrp2/3 deletions, false positivity by RDT is a challenge that can result in the underestimation of the deletions.
The diagnostic coverage of RDTs in Tanzania is around 90% in public and private health facilities, replacing microscopy, which is used in only about 10% of all health facilities [11].
PfHRP2 is a 60-105 kDa water-soluble protein secreted by P. falciparum trophozoites and schizonts [15-17]. It is synthesized and secreted in the human host approximately 2 hours after infection [18]. The gene encoding this subtelomeric protein is located at positions 1374236 to 1375299 on chromosome 8 [19]. pfhrp2 has a length of 1063 bp and consists of two exon (coding) regions and an intron (non-coding) region. The gene is flanked by four upstream and three downstream microsatellites [20,21].
The pfhrp2 subtelomeric coding region is prone to chromosomal rearrangements with nine gene breaking points, making it highly polymorphic [22]. A large region of tandem repeats within the pfhrp2 sequence encodes a polypeptide containing histidine, alanine, and aspartic acid. RDT detection panels include monoclonal antibodies (MAbs), which target specific HRP2 antigen epitopes [16,20]. There are about 13 major epitopes targeted by different monoclonal antibodies impregnated in the flow panel of RDT cassettes [23,24]. Detection sensitivity correlates well with the frequency and abundance of epitopes present in the sample. With the amino acid repetitive rearrangement in the pfhrp2 region, partial epitopes can exist that are less reactive with capture antibodies than full-length epitopes [23].
A previous study by Baker et al. [20] classified the amino acid sequence of PfHRP2 into 24 repeat types. Type 2 (AHHAHHAAD) and type 7 (AHHAAD) occur at high frequency (100%), and type 2 is associated with the basic function of the protein [25-27]. Based on the frequency of types 2 and 7 repeats, a regression model was developed to predict the sensitivity of RDT kits [28]. The model predicted that, with parasitaemia ≤ 250 parasites/µL and a function of the frequencies of types 2 and 7 of < 43, HRP2-based RDTs will fail to detect P. falciparum [28]. However, the model could not be reproduced 5 years later, when its prediction did not match the WHO lot testing results set at >200 parasites/µL [27]. Several studies have shown that sequence variation in pfhrp2, which leads to extensive epitope modification, might affect the performance of RDTs [24,28].
In light of pfhrp2 deletions and sequence variations [9,29], the WHO recommends the systematic surveillance of RDT performance in areas with a high coverage of HRP2-based test kits [30]. This study investigated the natural amino acid sequence variation in P. falciparum field isolates to assess the performance of RDTs.

Study Areas and Samples
The samples used in this study were collected during community-based cross-sectional surveys in the long rainy season between April and June 2018 in Handeni and Moshi, north-eastern Tanzania. Community sensitization and engagement were carried out, and only participants who voluntarily consented to participate were enrolled. Handeni is characterized as a moderate-high malaria transmission area, whereas Moshi is a low malaria-endemic area [31,32].

Plasmodium falciparum Detection
Dried blood samples were shipped to The London School of Hygiene and Tropical Medicine (LSHTM), where DNA extraction was carried out using a robotic DNA extraction system (Qiasymphony, QIAGEN, Hilden, Germany) [10,29,33]. A nested polymerase chain reaction (PCR) using specific primers for P. falciparum amplifying a fragment of 206 bp was performed as described elsewhere [34].

Sequence Data Analysis
We used Geneious (Biomatters, San Diego, CA, USA) to conduct sequence analysis, including the DNA quality check and translation into amino acid sequences. Repeat pattern frequency and sequence length were analysed using RStudio.
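The repeat counting described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which ran in Geneious and R); it assumes that a translated sequence is read from left to right and that, at each position, the longest matching repeat motif is taken. The motif table contains only the Baker types quoted in this article, and the function name and the toy sequence are hypothetical.

```python
from collections import Counter

# Baker repeat types quoted in this article (type number -> motif).
# This is only a subset of the 24 types; the full table is in Baker et al.
BAKER_MOTIFS = {
    2: "AHHAHHAAD",
    4: "AHH",
    7: "AHHAAD",
    10: "AHHAAAHHATD",
    12: "AHHAAAHHEAATH",
    15: "AHHAHHAAN",
}


def count_repeats(aa_sequence, motifs=BAKER_MOTIFS):
    """Count repeat types with a greedy, longest-match-first scan (an assumption)."""
    # Try longer motifs first so that AHHAAD is not read as AHH + AAD.
    ordered = sorted(motifs.items(), key=lambda item: len(item[1]), reverse=True)
    counts = Counter()
    pos = 0
    while pos < len(aa_sequence):
        for type_id, motif in ordered:
            if aa_sequence.startswith(motif, pos):
                counts[type_id] += 1
                pos += len(motif)
                break
        else:
            # Residue not covered by any listed motif: skip it.
            pos += 1
    return counts


# Hypothetical toy sequence: one type 2, two type 7, one type 4 repeat.
toy_sequence = "AHHAHHAAD" + "AHHAAD" + "AHHAAD" + "AHH"
print(count_repeats(toy_sequence))
```

Because the motifs overlap (for example, AHH followed by AHHAAD spells the type 2 motif), the result depends on the parsing rule, so counts from a different rule are not directly comparable.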

Statistical Analysis
Samples with parasitaemia of more than 1000 p/µL were used for this analysis. HRP2-RDT sensitivity prediction was performed following the model developed by Baker et al. [28]. Four categories were established based on the score of the function of the frequency of types 2 and 7. HRP2-RDT will be very sensitive if the score of types 2 and 7 frequencies is >100, sensitive if the score is 50-100, borderline if the score is 44-49, and non-sensitive if the score is <43 [36].
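The four categories can be written as a small Python sketch. The text does not spell out the function of the type 2 and type 7 frequencies; the sketch assumes it is the product of the two repeat counts, which should be checked against Baker et al. [28]. The function name is hypothetical. A score of exactly 43 is not covered by the cut-offs as stated, so it is returned as unclassified rather than guessed.

```python
def classify_rdt_sensitivity(type2_count, type7_count):
    """Return (score, category) using the cut-offs quoted in the text."""
    # Assumption: the score is the product of the two repeat counts.
    score = type2_count * type7_count
    if score > 100:
        category = "very sensitive"
    elif 50 <= score <= 100:
        category = "sensitive"
    elif 44 <= score <= 49:
        category = "borderline"
    elif score < 43:
        category = "non-sensitive"
    else:
        # score == 43 falls between the stated cut-offs.
        category = "unclassified"
    return score, category


# Hypothetical repeat counts for a single isolate.
print(classify_rdt_sensitivity(6, 7))
```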
Data were entered and analysed using SPSS version 20 (SPSS Inc., Chicago, IL, USA) and Microsoft Excel 2016. Results are presented in tables and graphs as absolute numbers (N) and percentage values (%). The median amino acid (aa) length was compared using the non-parametric Mann-Whitney U test. The median frequencies of aa were compared using Fisher's exact test since the expected values were less than 10. A p-value less than 0.05 was considered significant.
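To make the length comparison concrete, the Mann-Whitney U statistic can be computed directly from pairwise comparisons, as in the sketch below. This is not the SPSS procedure used in the study: it returns only the two U statistics, with no p-value, and both the function name and the example lengths are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) from pairwise comparisons; ties count as 0.5."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical amino acid lengths for two small groups of isolates.
current_lengths = [172, 200, 259]
previous_lengths = [207, 250, 287]
print(mann_whitney_u(current_lengths, previous_lengths))
```

The smaller of the two values is usually taken as the test statistic and compared with a reference distribution to obtain the p-value.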

Results
The present study analysed exon 2 sequences of pfhrp2 from 39 P. falciparum field isolates from Tanzania (Figure 1). The results of Plasmodium species identification in the study area have already been published elsewhere [24], and samples that were positive by RDT and microscopy (parasitaemia > 1000 p/µL) and identified as P. falciparum were selected for direct sequencing. However, we were able to generate high-quality sequences only from the Handeni samples, probably because of the low levels of parasitaemia in the samples from Moshi. The amino acid classification was carried out following the classification developed by Baker et al. [27]. Out of 24 amino acid repeat types, 15 were identified in this study, of which types 2 (AHHAHHAAD), 4 (AHH), and 7 (AHHAAD) were present in a high frequency (>89%) and abundance in all 39 samples. Types 10 (AHHAAAHHATD), 12 (AHHAAAHHEAATH), and 15 (AHHAHHAAN) were present in low frequency (2.6%) (Table 1).

Distribution of PfHRP2 Amino Acid Repeats in Tanzania
We compared our analysis of amino acid repeats with a previous study conducted in Tanzania in 2010 [27]; both studies analysed 39 samples. For about seven of the 24 repeat types, the mean number of amino acid repeats differed significantly between the two studies (p < 0.05). In the current study, type 2 (AHHAHHAAD) occurred more frequently across all samples than the other types (Table 2).

HRP2-RDT Sensitivity Prediction in Detecting P. falciparum in Tanzania
Using the Baker predictive model and sensitivity classification, RDT insensitivity in detecting P. falciparum was estimated at 69% of the samples analysed. The overall predicted sensitivity was 28%, and only 3% of the samples fell into the borderline sensitive group (Table 3).

Distribution of "Non-Baker" Amino Acid Repeats
The most prevalent types were ADA and HAAD, which were present in all samples (100%). Types AHHADY, AAAD, and AHHAY were the least prevalent (2.6%) (Figure 2).

RDT Major Epitopes in Tanzania
There are about 13 major antigenic epitopes in PfHRP2 that are targeted by different classes of monoclonal antibodies (MAbs) in HRP2-based RDTs. In the current study, 11 of the 13 (85%) were present. Epitopes such as DAHHAHHA, AHHAADAHHA, and AHHAADAHH that are targeted by 3A4/PTL-3, C1-13, and S2-5-C2-3 MAbs, respectively, were present in all samples (100%). Epitopes DAHHVADAHH and AAYAHHAHHAAY were not present in the field isolates in this study (Figure 3).
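Epitope presence of this kind can be checked with a simple substring search, sketched below in Python. The sketch assumes that an epitope counts as present when its exact amino acid string occurs in the translated sequence, which may differ from the criterion used in this study. Only the five epitopes named above are listed (not all 13), and the function name and the two toy sequences are hypothetical.

```python
# Epitopes named in this section; the full set of about 13 is not listed here.
EPITOPES = [
    "DAHHAHHA",
    "AHHAADAHHA",
    "AHHAADAHH",
    "DAHHVADAHH",
    "AAYAHHAHHAAY",
]


def epitope_prevalence(sequences, epitopes=EPITOPES):
    """Percentage of sequences that contain each epitope as an exact substring."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    total = len(sequences)
    return {
        epitope: 100.0 * sum(epitope in seq for seq in sequences) / total
        for epitope in epitopes
    }


# Hypothetical toy sequences, not data from this study.
toy_sequences = ["AHHAADAHHAHHAAD", "DAHHAHHAADAHH"]
for epitope, percent in epitope_prevalence(toy_sequences).items():
    print(epitope, percent)
```

Exact matching does not capture the partial epitopes mentioned in the Introduction, which may still react weakly with the capture antibodies.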

Discussion
Pfhrp2 exon 2 sequences from the field isolates of P. falciparum showed substantial sequence diversity. We reported the sequence length, epitope type, and frequency and predicted the sensitivity of HRP2-RDT detection.
A total of 39 amino acid sequences were generated, ranging in length from 172 to 259 amino acids. The possible causes of the differences in length are frequent breaks and joining in chromosome 8 during meiosis and mitosis. The gene has about eight breaking points, and each break and rejoining event generates a new sequence, leading to the observed variation in length and arrangement [22,37-39]. Studies have demonstrated that this could be a normal mechanism in the parasite and can ultimately lead to polymorphism in the gene. In Tanzania, amino acid lengths ranging from 207 to 287 have been observed, which is consistent with the global range of amino acid lengths [27].
Following Baker's amino acid classification, we found 15 of the 24 (62.5%) amino acid repeat types, of which 12 were also previously found in Tanzania [27]. Amino acid repeat types AAY, AHHAHHAAN, and AHHAA are new and are reported here for the first time in field isolates from Tanzania. Only one repeat type (ARHAAD) was previously reported but not found in the current study [27]. It is argued that recombination in polyclonal P. falciparum infections, particularly in high transmission areas, can result in diversity and the emergence of different polymorphisms in the pfhrp2 gene [27]. Several studies have demonstrated the possibility of reduced sensitivity and overall performance of RDTs due to sequence variation in the pfhrp2 gene [24,28].
The results of the sequence analysis from this study showed that types 2 and 7 amino acid repeats are common in most samples, occurring at a high prevalence but at different frequencies. These two types are believed to form the basis of major epitopes, although the role of these repeats in the function of HRP2 in P. falciparum is not known [40,41]. Different studies have shown a significant association between the frequency of the two types and the performance of RDTs at different parasitaemia levels [28,42,43]. The results of our analysis based on the combined frequency of types 2 and 7 indicate that 69% of the samples had a score of <43. This score suggests a low frequency of types 2 and 7, which implies a predicted reduction in RDT sensitivity. This is in line with Baker's regression model, which predicts RDT insensitivity, especially at low parasitaemia.
We also found 14 amino acid repeats that are not in Baker's classification (non-Baker repeats). Types ADA and HAAD were present in all of the samples (100%), which may suggest an important role in the physiology of the parasite. Studies in Madagascar and Papua New Guinea previously reported some of the non-Baker repeats but at much lower frequencies [26,44]. Their contribution to the efficacy and performance of RDTs is yet to be determined, and this calls for further investigation.
In this study, we found 11 of the 13 (85%) major epitopes that are targeted by most of the RDT kits distributed globally. The most prevalent epitopes were DAHHAHHA, AHHAADAHHA, and AHHAADAHH, which were present in all isolates analysed. These findings suggest that RDT kits with monoclonal antibodies targeting these epitopes are likely to perform well in the study area. The three epitopes also occur in high proportions elsewhere in Africa [26]. Laboratory studies have tested the same MAbs in different field isolates and observed significant differences in reactivity, suggesting that sequence variation and frequency have an impact on RDT performance [23,24].
Genetic diversity in pfhrp2 can potentially result in the expression of more or less complex PfHRP2. Previous studies have shown that high levels of antibodies to PfHRP2 might lead to reduced sensitivity of RDTs, particularly in high transmission areas, due to the formation of antibody-PfHRP2 complexes that make the protein unavailable in the plasma. The protein appears to elicit antibodies with a short half-life, since there is no correlation between anti-PfHRP2 titres and the age of study participants [45].
Our study provided evidence of sequence variation in pfhrp2 in field samples from Tanzania. Comparing our results with a previous study, it is evident that there are significant differences in the amino acid repeats. We could not validate Baker's model to explain the level of RDT performance in this study, but we predicted the effect of pfhrp2 polymorphism on RDT sensitivity in Tanzania. More studies should focus on the correlation between RDT performance and the amino acid repeat types, both "Baker" and "non-Baker".

Conclusions
The findings from this study provided information on pfhrp2 sequence polymorphism and predicted the effect on RDT performance. The data on antigenic epitopes presented in this study will inform on the purchase and supply of effective RDT in Tanzania. There is an urgent need to deploy a novel and unconventional point-of-care test that exploits magnetic resonance in malaria diagnosis [46,47].

Study Limitations
The limited number of samples analysed in this study might have led to an underestimate of the effect of amino acid repeats on RDT performance, particularly in lower Moshi, where malaria prevalence is very low. More recent data from the study areas could show a different amino acid repeat pattern. This study could not validate Baker's model based on the field isolates from Tanzania, but it could predict that, in the event of low parasitaemia, RDTs could be insensitive. We did not sequence pfhrp3, which is the isoform of pfhrp2 and usually cross-reacts with anti-HRP2 antibodies and increases RDT sensitivity.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: Not applicable.
Acknowledgments: Our sincere gratitude goes to all study participants in Handeni and Moshi for their involvement in this study. The support we received from the district medical office at both sites is commendable.