Cryptic Diversity and Demographic Expansion of Plasmodium knowlesi Malaria Vectors in Malaysia

Although Malaysia is considered free of human malaria, there has been a growing number of Plasmodium knowlesi cases. This alarming trend highlighted the need for our understanding of this parasite and its associated vectors, especially considering the role of genetic diversity in the adaptation and evolution among vectors in endemic areas, which is currently a significant knowledge gap in their fundamental biology. Thus, this study aimed to investigate the genetic diversity of Anopheles balabacensis, Anopheles cracens, Anopheles introlatus, and Anopheles latens—the vectors for P. knowlesi malaria in Malaysia. Based on cytochrome c oxidase 1 (CO1) and internal transcribed spacer 2 (ITS2) markers, the genealogic networks of An. latens showed a separation of the haplotypes between Peninsular Malaysia and Malaysia Borneo, forming two distinct clusters. Additionally, the genetic distances between these clusters were high (2.3–5.2% for CO1) and (2.3–4.7% for ITS2), indicating the likely presence of two distinct species or cryptic species within An. latens. In contrast, no distinct clusters were observed in An. cracens, An. balabacensis, or An. introlatus, implying a lack of pronounced genetic differentiation among their populations. It is worth noting that there were varying levels of polymorphism observed across the different subpopulations, highlighting some levels of genetic variation within these mosquito species. Nevertheless, further analyses revealed that all four species have undergone demographic expansion, suggesting population growth and potential range expansion for these vectors in this region.


Introduction
Malaria remains a persistent global public health challenge, and countries in Southeast Asia have been assigned the goal of malaria elimination by 2030. This ambitious target highlights the urgency and importance of concerted efforts to combat malaria and reduce its burden in the region. Malaysia has been free of human malaria since 2018 [1], but P. knowlesi, a simian malaria parasite, is the predominant species currently occurring in the country [2]. All countries in SEA have reported the occurrence of P. knowlesi, with the exception of Timor-Leste [3]. It is crucial to consider the WHO [4] recommendation to postpone the certification of a malaria-free status for countries reporting significant P. knowlesi cases in the region. This highlights the importance of ongoing surveillance, monitoring, and control efforts to effectively address the persistence of malaria and prevent the potential reintroduction of human malaria in Malaysia and neighboring countries.

DNA Extraction and Polymerase Chain Reaction
DNA was extracted from the mosquitoes' legs using InstaGene Matrix (Bio-Rad, Hercules, CA, USA) according to the manufacturers' protocol. The extracted DNA was kept at −20 • C until required. All Anopheles mosquitoes from the Leucosphyrus Group obtained in this study, including some archived samples, were further molecularly characterised using the internal transcribed spacer 2 (ITS2) region and mitochondrial cytochrome c oxidase subunit 1(CO1) gene. The ITS2 was amplified by ITS2A and ITS2B primers [27], with the PCR conditions as follows: denaturation at 95 • C for 2 min; 35 cycles of amplification at 95 • C for 30 s; annealing step at 51 • C for 30 s, with elongation step at 72 • C for 1 min; followed by final elongation step of 10 min at 72 • C. LCO1490 and HCO2198 primers [28] were used to amplify the CO1 gene. The PCR conditions were as follows: denaturation at 95 • C for 3 min; 35 cycles of amplification at 95 • C for 1 min; annealing step at 50 • C for 1 min, with elongation step at 72 • C for 1 min; followed by final elongation step of 10 min at 72 • C and held at a temperature of 4 • C. Each reaction mixture of 25 µL contained 5 µL DNA template, 0.5 µM primers, respectively, 0.2 mM dNTP, 3 mM MgCl2, 1 × GoTaq ® Flexi Buffer, and 1.0 U of GoTaq ® DNA polymerase (Promega Corporation, Madison, WI, USA). This master mix was used for both primer sets. Amplicons were subjected to electrophoresis on 1.5% agarose gels. The amplified product was purified from the gel and outsourced for Sanger sequencing (Apical Scientific Sdn. Bhd., Malaysia). Sequences of each species were performed using the Basic Local Alignment Search Tool (BLAST) (http://blast.ncbi.nlm.nih.gov/Blast.cgi, accessed on 12 February 2021) for similarity searches. A species was confirmed by ≥98% identity and query coverage to the deposited sequence.

Data Analyses for Genetic Studies
The study included sequences of mosquitoes collected from Peninsular Malaysia and Sarawak, while additional sequences from Sabah and Selangor were retrieved from the NCBI GenBank, providing a comprehensive representation of mosquito populations across different regions of Malaysia. CO1, ITS2, and combined sequences of each species were aligned using BioEdit (Version 7.2) [29]. Haplotype networks for An. balabacensis, An. cracens, An. introlatus, and An. latens, based on their polymorphic sites, were constructed by using the median-joining method in NETWORK version 5.0.0.1 software (Fluxus Technology LTD, Suffolk, UK). The number of haplotypes in the subpopulations, haplotype diversity [30], and nucleotide diversity [31] were also estimated using DnaSP 5.0 software. Genetic distances among species/populations were calculated using MEGA 11 software [32].
Pairwise genetic differentiation (F ST ) and gene flow (Nm) values between the subpopulations of each species were tested for significance. DnaSP 5.0 software was also used in estimating gene flow using 1000 permutations [33]. The levels of genetic differentiation can be categorized as F ST > 0.25 (great differentiation), 0.05 to 0.25 (moderate differentiation), and F ST < 0.05 (negligible differentiation) [34]. The levels of gene flow can be categorized as Nm > 1 (high gene flow), 0.25 to 0.99 (intermediate gene flow), and Nm < 0.25 (low gene flow) [35]. To analyse the randomness of DNA sequence evolution, a neutrality test was performed by using Tajima's D [36] and Fu's Fs [37] with 1000 simulations. Mismatch analysis, Harpending's raggedness index (Rag) [38], and the R 2 statistic of Ramos-Onsins and Rozas [39] were used to investigate demographic expansion.

Demographic Analyses
The low values of the raggedness index and R 2 statistic from the mismatch distribution tests and the results of the neutrality test indicated that all the vector species studied were expanding. This expansion trend was further supported by the unimodal shape of the graph observed in at least one gene dataset of the mismatch distribution analysis. These findings collectively suggest that the vector populations are undergoing growth and expansion (Figure 2). Despite a multimodal shape being observed in An. latens due to the presence of two distinct lineages, a separate analysis was conducted for each lineage, and a unimodal shape was also observed (unpublished data The mismatch distribution graphs for CO1 and ITS2 of An. introlatus, An. latens, An. cracens, and An. balabacensis are shown in Supplementary Figure S1.

Haplotype Network
An. introlatus from Johor had the highest number of haplotypes for CO1, ITS2, and the combined dataset (red colour). A higher number of haplotypes (n = 25) was observed in the combined network, with the majority originating from Johor. An. latens from Peninsular Malaysia were clustered distantly from Malaysia Borneo in all three CO1, ITS2, and combined network analyses. H4 in An. cracens held all the populations from Perlis, and the other three haplotypes included the Pahang population for CO1. Only H1 (the predominant haplotype) shared populations from Perlis and Pahang for ITS2. The haplotypes from the combined network were connected to each other, as they were from the same population (Pahang) of An. cracens. The "star-like" shape was observed across all four species. Overall, the combined network for An. balabacensis indicated fewer haplotypes (n = 5) compared with CO1 and ITS2 (Figure 3).

Genetic Differentiation (F ST ) and Gene Flow (Nm)
Overall, in the 10 subpopulations, the highest genetic differentiation was observed between Kg. Sg Dara, Perak, and Hutan Lenggor, Johor, for the concatenated sequences of CO1 and ITS2 in An. introlatus (Table 2). Two out of the three subpopulations exhibited high levels of genetic differentiation, and moderate gene flow was detected between most of the subpopulation pairs in An. latens (Table 3). Intermediate genetic differentiation and correspondingly high gene flow were observed between the two subpopulations of An. cracens (Table 4). In contrast, low levels of genetic differentiation and gene flow were found between the Kem Kayu Merarap, Sarawak, and Simpang Utong, Sarawak, subpopulations in An. balabacensis (Table 5).

Discussion
This study represents the first attempt to investigate four simian malaria vectors (An. balabacensis, An. cracens, An. introlatus, and An. latens) collected from various sites in Malaysia. Based on the combined sequences of An. introlatus, the samples from Hulu Kalong and Kongsi Balak had the highest genetic diversities among the other subpopulations, suggesting greater genetic variation and potential population differentiation in these specific locations. The population from Perak had zero diversity for all the sequences analysed. This is likely linked to the lack of variation within the population in Perak [40], sampling errors [24], or the low number of samples (n = 5) within the same area. The presence of the same haplotypes in the different subpopulations, such as in CO1-H2, ITS2-H1, and the combined sequences-H2, creates the possibility of inter-breeding and migration among the An. introlatus subpopulations [24]. Hence, a different number of haplotypes and unique haplotype network structures for each gene can be observed.
The Anopheles latens sequences from West and East Malaysia were examined in this study. High haplotype and nucleotide diversities were observed in the subpopulations of East Malaysia (Sabah and Sarawak) for CO1, ITS2, and the combined sequences. This hypothesis needs further attention because low genetic diversity is the typical characteristic of insects in island populations [41][42][43]. The haplotypes of An. latens were separated into two clusters for all three haplotype networks. The haplotype clusters contained sequences from Peninsular Malaysia separated from sequences from Malaysia Borneo. Consequently, it is unknown whether the behaviour and the capability of spreading the simian malaria parasites of these two clusters of An. latens mosquitoes are similar or different.
Additionally, this study revealed the presence of the two genetically distinct An. latens clusters based on the CO1 and ITS2 sequences. Nonetheless, defining a species in Anopheles is challenging, even if it is well studied. Thus, multiple genes and concatenated markers were used to improve the accuracy of the assessments of the genetic population structure [44,45]. It is known that CO1 is commonly used for genetic studies of malaria vectors [46][47][48][49], and ITS2 is the well-known molecular marker for species identifications [50]. Thus, based on the results obtained from these gene markers, it is plausible to consider the existence of two different types of An. latens in Malaysia.
The genetic distances between the two clades of An. latens from Peninsular Malaysia and Malaysia Borneo were relatively high, ranging from 2.3% to 5.2% for CO1 and from 2.3% to 4.7% for ITS2. Notably, a cutoff point of 3% is often used as a threshold for species boundaries in insects. Given that the genetic distances observed between the An. latens populations from these different geographical regions surpassed this threshold, it strongly suggests that they may represent two distinct species. A detailed morphological examination is also warranted to confirm its species status.
The findings demonstrated that geographic distance might have a major effect on the genetic structure of An. latens from the two different geographical regions. The South China Sea significantly divides East and West Malaysia, and this may be a factor leading to the intra-specific genetic discontinuities [51,52]. Therefore, the South China Sea is likely a barrier to gene flow between An. latens from Peninsular Malaysia and Borneo. This is further supported by the high levels of genetic differentiation and moderate gene flow in this study.
Overall, the high haplotype diversity observed in An. cracens in Pahang indicates its population growth, likely resulting from the accumulation of new mutations over time, with the emergence of new haplotypes within the population. Differences were observed between the CO1 and ITS2 markers, which can be attributed to the selection of different molecular markers. Each marker may have distinct mutation or evolutionary rates, varying selection pressures, and different gene constraints [53,54]. Nevertheless, this diversity provided valuable insights into the genetic variation, population structure, and evolutionary processes of the mosquito populations. High gene flow was observed between the Kem Sri Gading and Sg. Ular subpopulation pairs from Pahang, despite being collected from different sites. This can be attributed to the absence of significant geographical barriers between these locations. Furthermore, moderate and low levels of genetic differentiation were observed between the subpopulation pairs. The Sg. Ular subpopulation, located within a durian plantation, and the Kem Sri Gading subpopulation, designated as a reserved forest for camping, tracking, and cycling activities, exhibited distinct ecological characteristics. The association between the geographic distance and the anopheline population genetic structure likely differed by species, most likely due to variations in breeding sites, breeding patterns, and behaviour [55]. The gene flow or genetic differentiation of An. cracens in the study were not dependent on the geographical distances. Gene flow among the vectors of malaria due to the lack of geographical distance and barriers was observed in several studies [55][56][57][58][59].
The samples from different study sites could belong to the same, single continuous population because of the low genetic differentiation detected among the subpopulation pairs in both genes for An. balabacensis. Nevertheless, low levels of gene flow were also observed from the overall subpopulations in the combined sequences, yet high gene flow was seen in the COI and ITS2 analyses. Although the sampling distributions varied, the ecological habit is relatively similar in Sabah and Sarawak [60]. Thus, the gene flow occurred easily without significant physical barriers.
The "star-like" network observed in all four vectors suggests population expansion [61,62]. Furthermore, the unimodal shape and low values of the raggedness index and R 2 statistic from the mismatch distribution tests, along with the negative values from the neutrality tests, further support the population expansion of An. balabacensis, An. introlatus, An. latens, and An. cracens in Malaysia. Likewise, evidence of demographic expansion has also been reported in other insects, such as dragon flies, black flies, and buffalo flies, in Southeast Asia [63][64][65].

Conclusions
A deeper understanding of the genetic diversity of local vectors could provide valuable information for epidemiological surveillance and malaria vector control strategies. The presence of the different lineages in the An. latens populations could be a topic of interest for future studies, as it would allow for the identification of which genotypes are more likely to be exposed to simian malaria infection in the wild. Furthermore, in this study, the collection sites covered most of the P. knowlesi malaria endemic regions in Malaysia, yet future research should conduct extensive sample collection in a wider distribution area to obtain a complete overview of the genetic structure of the vectors. Thus, in light of the elimination of malaria, it is timely for Southeast Asian nations to commit a concerted effort to study the vectors and to develop vector control strategies to prevent future outbreaks.