Phylogeny of Anopheles (Kerteszia) (Diptera: Culicidae) Using Mitochondrial Genes

Identification of mosquito species is necessary for determining the entomological components of malaria transmission, but it can be difficult in morphologically similar species. DNA sequences are largely used as an additional tool for species recognition, including those that belong to species complexes. Kerteszia mosquitoes are vectors of human and simian malaria in the Neotropical Region, but there are few DNA sequences of Kerteszia species in public databases. In order to provide relevant information about diversity and improve knowledge in taxonomy of Kerteszia species in Peru, we sequenced part of the mitochondrial genome, including the cytochrome c oxidase I (COI) barcode region. Phylogenetic analyses structured all species of mosquitoes collected in Peru into a single clade, separate from the Brazilian species. The Peruvian clade was composed of two lineages, encompassing sequences from Anopheles (Kerteszia) boliviensis and Anopheles (Kerteszia) pholidotus. An. pholidotus sequences were recorded for the first time in Peru, whereas An. boliviensis sequences were for the first time published in the GenBank database. Sequences generated from specimens morphologically identified as Anopheles (Kerteszia) cruzii clustered into three separate clades according to the collection localities of Serra do Mar, Serra da Mantiqueira, and Serra da Cantareira, confirming An. cruzii as a species complex, composed of at least three putative species.


Introduction
Malaria is a mosquito-borne tropical disease that remains an important public health problem in some tropical and subtropical countries. In South America, the majority of cases occur in areas of the However, they can be easily identified by characteristics of male genitalia, as well as fourth-instar larva and pupa [24,25]. Anopheles cruzii has been demonstrated to be a species complex on the basis of cytological differences on the banding patterns of polythene chromosomes [26]. Recently, Oliveira et al. [27], using the mitogenome, showed that the specimens from Serra da Cantareira, São Paulo, Brazil, may belong to distinct lineage that is sister to An. cruzii.
Differences in the ecology and behavior of species can affect the transmission of malaria and the success of the methods used in the control, such as insecticides and insecticide-impregnated bed nets [28]. Therefore, the characterization of species composition of a local population for the design of vector control projects is of vital importance. DNA-based methods have been widely used for species identification. Mitochondrial genes have been broadly employed in studies focusing on taxonomy and phylogeny of the Culicidae and Anophelinae subfamilies. The cytochrome c oxidase subunit I (COI) gene of the mitogenome has been largely used to identify taxonomic groups at species level [29,30]. There are COI sequences available in the GenBank and BOLD databases, including sequences generated for the Mosquito Barcoding Initiative (MBI) (http://barcodeoflife.ning.com/group/mosquitobarcoding).
In the current study, we initially used in the phylogenetic analysis a fragment of mitochondrial cytochrome c oxidase I (COI) that represents the barcode region [31,32]. The aim was to analyze the haplotype diversity and verify the genetic variability of specimens collected in Peru. This mitochondrial region is the main source of reference sequences, and has been successfully used to identify mosquitoes [25,29,33,34] and to evaluate intraspecific variability, especially in Neotropical Anopheles [35][36][37]. This analysis included mosquitoes from Peru, which were previously identified as An. cruzii, An. boliviensis, An. Laneanus, and An. lepidotus using morphological characteristics. Lastly, we generated two more partial sequences (another COI region and ND4, NADH dehydrogenase subunit 4 gene), totalizing 10% of the whole mtDNA (mitochondrial DNA) (≈1500 bp), and compared with reference sequences from a previously published complete mitochondrial DNA genome, to verify whether An. cruzii is a species complex.

Mosquito Sampling and Handling
Females were collected in different localities and dates (Tables 1 and 2). In Brazil, mosquitoes were collected using CDC (Centers for Disease Control) light traps with CO 2 (dry ice) for 12 h, beginning at twilight, or with Shannon trap for 3 h, from 6:00 p.m. to 9:00 p.m. Peruvian mosquitoes were collected with Shannon traps, from 6:00 p.m. to 9:00 p.m. Mosquitoes were killed with chloroform vapor and transported to the laboratory for identification, using morphological taxonomy key by Zavortink [18], Consoli and Lourenço-de-Oliveira [38], and Forattini [12]. Mosquitoes were individually stored at −20 • C in 1.5 mL plastic tubes sealed with parafilm.

Genomic DNA Extraction and PCR Amplification of Mitochondrial Gene Fragments
Mosquitoes were crushed with disposable tips in lysis buffer, and genomic DNA (gDNA) was obtained using PureLink Genomic DNA Purification Kit (Invitrogen), according to the manufacturer's instructions.
A fragment of 710 base pairs (bp) of the barcode region of the mitochondrial COI gene was generated by PCR amplification using the primer pair LCO1490 and HCO2198 [39], following the protocol proposed by Ruiz et al. [40]. A fragment of ≈600 bp of COI gene (gene region used in phylogeny) was obtained as described [41] with the primers COIF (5 -GGA TTA TTA GGA TTT ATT GT-3 ) and COIR (5 -GCA AAT AAT GAA ATT GTT CT-3 ). A fragment of ≈400 bp of the mitochondrial ND4 gene was generated using the primers ND4+ (5 -GTD YAT TTA TGA TTR CCT AA-3 ) and ND4-(5 -CTT CGD CTT CCW ADW CGT TC-3 ) and the PCR program, as described [42].

Sequencing, Alignment, and Phylogenetic Analysis
Amplified fragments were sequenced directly in both directions by BigDye Terminator v3.0 Cycle Sequencing Kit in ABI Genetic Analyzer (Applied Biosystems ® , Foster City, CA, USA), using the corresponding flanking primers. The COI and ND4 sequences obtained from newly sequenced mosquitoes were deposited in the GenBank database (COI: MH589352-MH589421; ND4: MH560305-MH560342). Sequences were aligned with the reference sequences (Table 3) using Clustal X version 1.81 [43]. Alignments were then inspected and edited in MEGA version 5.1 [44].  Two methods of phylogenetic reconstruction were implemented using the Bayesian approach. In the first round of analysis, 624 bp of the COI barcoding region was employed, due the higher availability of sequences of this COI gene region for the Anopheles diversity in the GenBank database, from which all the sequence data of reference Kerteszia species were downloaded. In the second round of analysis, phylogenetic relationship was investigated using ND4 and COI genes. COI and ND4 datasets were concatenated, after trimming the primers off from the sequences-including 338 bp of the ND4 gene, 511 bp of the COI, and 624 bp of the barcode region of the COI, generating a total of 1473 bp. In this analysis only reference sequences from complete mtDNA were used, avoiding an incomplete final matrix and missing data. All sequence accession numbers used in both analyses are provided in Tables 1-3 and added into phylogenetic trees.
The phylogenetic reconstructions were performed using the Bayesian approach implemented in MrBayes v3.2.0 [45]. Bayesian inferences were run with two Markov Chain Monte Carlo searches of 3 million generations, each with sampling of 1 in 300 trees. After a burn-in of 25%, the remaining 15,002 trees were used to generate a 50% majority-rule consensus tree. The results of analyses were visualized using FigTree version 1.4.0 [46]. The topologies were rooted using sequences obtained for species of the subgenera Anopheles or Nyssorhynchus (Tables 2 and 3).

Results
The results of the analyses using only DNA sequences of the COI barcoding region are shown in Figure 1. This approach was used with the aim of increasing the number of sequences in the phylogenetic tree, as there are no complete mitochondrial sequences available for some Kerteszia species such as An. neivai, An. pholidotus, and An. lepidotus. The sequence alignment composed of 85 sequences with 624 bp length-9 from Peru, 26 from Brazil, and 50 references from GenBank-had 168 variable sites. The Bayesian phylogenetic tree (Figure 1) shows that the sequences from nine specimens from Peru grouped together in a strongly supported clade. The Peruvian species clade contains two major groups. One group encompasses sequences from specimens identified as An. boliviensis, An. cruzii, An. laneanus, and An. lepidotus. The second group contains sequences from An. lepidotus that clustered with An. pholidotus from Venezuela. For the second round of Bayesian analysis, approximately 1.5 kb (10% of the whole mtDNA) was employed, including COI and ND4 genes of 39 specimens of An. (Kerteszia) species from Peru and Brazil. Twelve new sequences were blasted with sequences available in GenBank for other Kerteszia species to verify identification and potential presence of stop codon in each sequence. The alignment generated 1473 bp with 371 variable sites. Sequences of Anopheles (Anopheles) species were used as outgroup. All clades recovered in the Bayesian phylogenetic analysis were strongly supported (Bayesian Posterior Probability-BPP = 1). All the Anopheles (Kerteszia) species were grouped in a clade that was separated into two clades: one containing all Peruvian species, and the second clade being formed of species collected in Brazil (Figure 2). The Peruvian species clade was split in two minor groups. Species from Brazil were separated in two clades: one clade composed of An. bellator and An. homunculus, and a second clade composed of An. laneanus and An. cruzii, with four mosquitoes morphologically identified as An. bellator (one of which was collected in 1946). Sequences of An. cruzii from Brazil were separated into clades in accordance with the geographical localities of Serra do Mar, Serra da Mantiqueira and Serra da Cantareira (Figures 2 and 3). For the second round of Bayesian analysis, approximately 1.5 kb (10% of the whole mtDNA) was employed, including COI and ND4 genes of 39 specimens of An. (Kerteszia) species from Peru and Brazil. Twelve new sequences were blasted with sequences available in GenBank for other Kerteszia species to verify identification and potential presence of stop codon in each sequence. The alignment generated 1473 bp with 371 variable sites. Sequences of Anopheles (Anopheles) species were used as outgroup. All clades recovered in the Bayesian phylogenetic analysis were strongly supported (Bayesian Posterior Probability-BPP = 1). All the Anopheles (Kerteszia) species were grouped in a clade that was separated into two clades: one containing all Peruvian species, and the second clade being formed of species collected in Brazil (Figure 2). The Peruvian species clade was split in two minor groups. Species from Brazil were separated in two clades: one clade composed of An. bellator and An. homunculus, and a second clade composed of An. laneanus and An. cruzii, with four mosquitoes morphologically identified as An. bellator (one of which was collected in 1946). Sequences of An. cruzii from Brazil were separated into clades in accordance with the geographical localities of Serra do Mar, Serra da Mantiqueira and Serra da Cantareira (Figures 2 and 3).    Tables 1 and 2).
In order to provide relevant information about diversity and improve knowledge in taxonomy of Kerteszia species from Peru, we sequenced two mitochondrial genes (COI and ND4), including the DNA barcode region.
The analysis of genetic differentiation of COI and ND4 genes indicated that there are at least two Kerteszia species of that occur in Peru. Preliminary identification of specimens of An. cruzii and An. lepidotus using morphological characters of the female was not confirmed by phylogenetic analyses of mitochondrial genes. In fact, specimens from Peru that were identified as An. cruzii had scales on the abdominal terga, indicating that they belonged to other species. In the Bayesian topology, those specimens of An. cruzii with scales on the abdomen terga grouped with most of the mosquitoes from Peru, which were identified as An. boliviensis. In addition, one specimen morphologically identified as An. lepidotus clustered with An. pholidotus, with an identity of 98% in the COI barcoding region. The clade encompassing An. pholidotus sequences was recovered and was well separated from An. lepidotus. Moreover, a Peruvian specimen preliminary identified as An. laneanus clustered with An. boliviensis. For accurate identification of An. boliviensis, An. pholidotus, and An. lepidotus using morphology, it is important to examine characters of the male and fourth-instar larva in addition to female characters [18]. Furthermore, until 2012 [25], the only morphological feature that had been proposed to differentiate females of An. pholidotus from An. lepidotus was the size of the scales on proximal tergites, being moderately wide or broad for An. lepidotus and predominantly narrow to moderately wide for An. pholidotus [18]. This characteristic was shown vague and difficult to interpret. In fact, in other countries of South America also, such as Colombia, molecular analyses showed that the species previously believed to be An. lepidotus corresponded to An. pholidotus [47].
Chromosomal banding pattern of the ovarian polytene chromosomes had already suggested that there are three putative species under the name An. cruzii, termed as A, B, and C [26,55]. More recently, studies employing DNA sequences from nuclear genes have corroborated this hypothesis [56][57][58][59]. Our analysis, using DNA sequences from mitochondrial genes, also supported the hypotheses of An. cruzii representing a complex of morphologically similar species. The sequences employed in this study that were generated from specimens identified as An. cruzii were separated in accordance to their geographical origin. Sequence with the GenBank code KU551284 was recovered in a clade that is sister to An. laneanus and yet separate from other specimens of An. cruzii. This KU551284 sequence was obtained from a mosquito collected in the Parque da Serra da Cantareira, situated in the mountain range north of the municipality of São Paulo [27]. The KU551285 sequence was also sorted separately from all the other cruzii sequences and was obtained from a mosquito collected in Itatiaia National Park, in the Mantiqueira mountain range, between the states of Rio de Janeiro and Minas Gerais [27]. All the other An. cruzii sequences were grouped in a branch composed by sequences from mosquitoes collected only in the Serra do Mar region, a mountainous chain that extends by approximately 1500 km along the coast, from Rio de Janeiro state to the north of Santa Catarina state. Indeed, at least two genetically distinct An. cruzii lineages have been shown to occur in the mountains covered by the Atlantic Forest in South-East Brazil [60]. Here, through another methodology, we confirmed these two populations and showed that a third, coming from another mountain range, the Serra da Cantareira, also separates from the other two in the phylogenetic trees. These three populations of An. cruzii were consistently separated regardless of the methodology used (analysis of mitochondrial or nuclear genes). Moreover, in our phylogenetic trees, the addition of another COI gene region as well as another mitochondrial gene (ND4) has not changed the topology of the tree, showing that the most important information is contained in the barcoding region, confirming that the COI gene is more useful for identifying sibling and cryptic species, as has been demonstrated [51].
We also demonstrated that mosquitoes morphologically identified as An. bellator (CA1, B1, B2, and B3) are likely An. cruzii. In fact, adult females of these species are morphologically similar and identification can be problematic when the adults were damaged during field collections [18]. In addition to the placement of these mosquitoes on the phylogenetic trees, their grouping based on the geographical region of the field collections confirms that An. cruzii is a species complex.

Conclusions
The analysis of genetic differentiation of COI and ND4 genes indicated that there are at least two Kerteszia species occurring in Peru: An. boliviensis and An. pholidotus. The occurrence of these species in Peru is herein registered for the first-time, using COI DNA sequences. Moreover, we confirmed the occurrence of two subpopulations of An. cruzii in the Atlantic rainforest and showed a third population in the mountain range of the Parque da Serra da Cantareira. This population is likely a distinct species that clustered separately from the remaining two lineages of An. cruzii in the phylogenetic trees. One lineage encompasses specimens from Serra do Mar on the Atlantic coast, and the second lineage is formed by An. cruzii from Serra da Mantiqueira. Besides their placement in the phylogenetic tree, the grouping of An. cruzii sequences reflects the geographical origin of the adults (Serra da Cantareira, Serra do Mar, and Serra da Mantiqueira). Thus, these findings add additional support for the An. cruzii species complex.