Terpenic Constituents of Essential Oils with Larvicidal Activity against Aedes Aegypti: A QSAR and Docking Molecular Study

Aedes aegypti is a vector for the arbovirus responsible for yellow fever, Zika and Chikungunya virus. Essential oils and their constituents are known for their larvicidal properties and are strong candidates for mosquito control. This work aimed to develop a quantitative structure–activity study and molecular screening for the search and design of new larvicidal agents. Twenty-five monoterpenes with previously evaluated larvicidal activity were built and optimized using computational tools. QSAR models were constructed through genetic algorithms from the larvicidal activity and the calculation of theoretical descriptors for each molecule. Docking studies on acetylcholinesterase (AChE) and sterol carrier protein (SCP-2) were also carried out. Results demonstrate that the epoxide groups in the structure of terpenes hinder larvicidal activity, while lipophilicity plays an important role in enhancing biological activity. Larvicidal activity correlates with the interaction of the sterol-carrier protein. Of the 25 compounds evaluated, carvacrol showed the highest larvicidal activity with an LC50 of 8.8 µg/mL. The information included in this work contributes to describing the molecular, topological, and quantum mechanical properties related to the larvicidal activity of monoterpenes and their derivatives.


Introduction
Mosquitoes are responsible for more diseases than any other group of arthropods [1]. The mosquito Aedes (Ae.) aegypti acts as a vector for the arbovirus responsible for yellow fever; it is also a vector of the Zika and chikungunya viruses and of dengue hemorrhagic fever [2]. The World Health Organization (WHO) estimates that approximately 3.9 billion people are at risk of dengue fever and that 390 million dengue infections occur annually worldwide. Of an estimated 500,000 people with severe dengue fever who require hospitalization each year, about 2.5% die due to complications [3]. Projections for 2050 indicate a potential expansion of Ae. aegypti and Ae. albopictus because of climate change, which implies a potential expansion of these mosquito-borne diseases [4]. In addition to these problems, mosquitoes resistant to the traditionally used insecticides have been identified [5].
Throughout history, plants and insects have coexisted and evolved in parallel. Plants, in turn, have used insects as pollinators and developed defense mechanisms against insect predators [6]. In this context, essential oils and their constituents have turned out to be beneficial bioactive compounds against disease-carrying mosquitoes and other insects [7,8].
Essential oils are substances of plant origin; they are mixtures of water-insoluble volatile secondary metabolites in different proportions, which are responsible for their biological characteristics [9]. Regarding their chemical composition, essential oils usually have phenylpropanes and terpenes, including aldehydes, alcohols, esters, and ketones. These compounds are responsible for essential oils' fragrance and biological properties [10].
For decades, essential oils and their constituents have been used as repellents and insecticides against different species of insects [11]. For example, essential oils from plants belonging to the botanical families Lamiaceae, Myrtaceae, and Poaceae have been widely reported as repellent and larvicidal agents [12]. Various extracts of Cymbopogon have been used traditionally to repel mosquitoes [13]. This genus produces the most widely used natural repellents in the world, and its activity against Ae. aegypti has been reported [14,15].
Reports on the pure components' repellent and larvicidal activity are more cases than those related to essential oils. In general, the study of the pure compounds has revealed the synergistic and antagonistic effects of the components of the essential oils, demonstrating that the larvicidal activity is not only associated with the major compounds, but that other molecules present in a lesser proportion also contribute to their activity [16-20].
Given all of the above, this work aims to conduct in silico studies on the larvicidal activity of terpenes and their derivatives in order to generate predictive mathematical models that can provide insight into the rational search for and design of new larvicidal agents against Ae. aegypti, and to elucidate the molecular properties related to their biological activity and mechanism of action.

Quantitative Structure-Activity Relationship
The larvicidal activity of the terpenes and derivatives tested is reported in Table 1 as LC50 values. Carvacrol and thymol were the most active compounds, with LC50 values of 8-11 µg/mL. Although some of the compounds had much higher LC50 values (up to 1150 µg/mL), all the results were used to develop the QSAR models. The chemical structures of the tested compounds are shown in Figure 1.
In parentheses, 95% confidence intervals; essential oil activity is considered significantly different when the 95% CIs fail to overlap. ** Chi-square value, significant at the p < 0.05 level.
The genetic algorithm analysis shows that the number of ring tertiary carbons (nCrt) and the number of phenolic groups (nArOH) are structural descriptors related to larvicidal activity. In contrast, the number of ketones (nCO) and the number of aliphatic ethers (nROR) are descriptors inversely related to the biological activity. Equation (1) shows the QSAR model with the highest statistical significance. Likewise, Table 2 includes four models obtained by the analysis, Figure 2a shows a plot of the predicted versus experimental activity for the training-set molecules, and Table 3 shows the values of the structural descriptors considered in the QSAR models.
Regarding molecular properties and larvicidal activity against Ae. aegypti, the best model included the following predictors: Ghose-Crippen octanol/water partition coefficient (AlogP), centralization (CENT), molar refractivity (AMR), and polarity number (POL). The model is expressed in Equation (2). A plot of the predicted versus experimental activity for the training-set molecules is shown in Figure 2b. The statistics for the other three QPAR models generated by the genetic algorithm analysis are included in Table 2. Table 4 shows the values of the topological, molecular, and quantum mechanical descriptors considered for the QPAR models.
Abbreviations: AlogP = Ghose-Crippen octanol-water partition coefficient (LogP), POL = polarity number, CENT = centralization, I = ionization potential, η = chemical hardness, Qtot = total absolute charge.

DFT Studies
The chemical reactivity study shows that menthol is the monoterpene with the highest values of ionization potential (I), chemical hardness (η), and energy gap (E_GAP), while γ-terpinene is the compound with the highest softness (S) and hydrocarvone is the molecule with the highest LUMO energy. The calculated quantum mechanical parameters are presented in Table 5. The values are expressed in electronvolts (eV), except for the dipole moment, which is expressed in debyes. Figure 3 shows the frontier orbitals of the most active molecules.

Molecular Docking Studies
The in silico results show that monoterpenes are able to interact in the active site of AChE, where acetylcholine is hydrolyzed, in a region positioned between the V196, L218, E220, N226, and V228 residues. Table 6 shows the binding free energy values of the monoterpenes and their derivatives on AChE, together with the amino acids involved in the interaction. Hydrocarvone and hydrodihydrocarvone were the compounds that bound most efficiently in the AChE active site. Figure 4 shows the interactions of the compounds with the highest affinity.

Docking on the sterol carrier protein shows that monoterpenes and their derivatives can interact in regions including the I19, R24, Q25, V26, and F105 residues. Table 7 shows the binding energies (ΔG) for each terpene evaluated. Compounds are listed according to their binding energy to highlight those with more favorable values. After the phenolic compounds, γ-terpinene and limonene were the monoterpenes with the most favorable binding energies, while hydrocarvone and hydrodihydrocarvone had the least favorable values. The conformation of limonene and γ-terpinene in the active site of SCP-2 can be seen in Figure 5. Figure 6 shows the interactions between the compounds with the highest affinity and the amino acid residues.

Discussion
The development of new larvicidal agents as prospects in mosquito control is essential nowadays due to the increased incidence of diseases involving these vectors [1], the territorial expansion because of climate change [4], and the emergence of strains resistant to traditionally used insecticides [5]. Essential oils and their constituents represent an option in the search and development of new larvicidal agents due to their widely reported biological activity [7,8] and physicochemical characteristics that facilitate their extraction or synthesis [9].
Regarding the activity evaluated in vitro, the compounds with LC50 values below 50 µg/mL were γ-terpinene, limonene, thymol, and carvacrol. The last two were the most active, with LC50 values of 10.3 and 8.8 µg/mL, respectively. We carried out an in silico analysis of monoterpenes that are naturally present in essential oils, and of some of their derivatives, comprising QSAR models, chemical reactivity calculations, and molecular docking, aiming to determine the structural and molecular characteristics that confer larvicidal activity on monoterpenes. The QSAR models suggest that ether and ketone groups hinder biological activity; the same observation was reported by Lima et al. [21], but the effect was not quantified. This is consistent with the observation that monoterpenes containing epoxy groups in their structures have less activity against Ae. aegypti.
QSAR model 2 (Table 2) includes the nCconj descriptor as a structural element directly related to larvicidal activity, suggesting that its presence potentiates the activity; this is supported by the activities of 3-carene and perillaldehyde. Limonene and γ-terpinene are among the terpenes with the highest larvicidal activity, but both lack conjugated carbons or hydroxyl groups in their structure; therefore, their activity also depends on other factors. The QPAR models identify the AlogP and CENT descriptors as those with the highest contribution to larvicidal activity.
Some studies have shown that molar refractivity and hydrophilicity correlate negatively with coumarin toxicity against Cx. pipiens and Ae. aegypti [22]. The present results reinforce the hypothesis that the descriptors associated with terpenes' polarity (AMR, POL) contribute indirectly to their activity. On the other hand, the hydrophobic profile correlates strongly with larvicidal activity, which can explain the biological activity of limonene and γ-terpinene. Other studies have also shown the importance of lipophilicity in terpenoids and terpenes. In a previous study, models were developed based on the activity of six monoterpenes, and it was noted that when the values of the vapor pressure and lipophilicity were diminished, the lethal concentration against Ae. aegypti also decreased [23]. The strong participation of lipophilicity can be explained by considering that the main route of entry of the compounds into the mosquito body is by contact (through the outer cuticle), because the larvicidal effect was mainly evaluated by immersing larvae in an aqueous environment to which the compound is applied. Thus, the molecule's hydrophobicity plays an important role in poisoning the larvae [24].
Topological indices are numerical quantities derived unambiguously from a molecular graph [25,26]. The graph center and related parameters are helpful for coding molecular structure, explaining reaction mechanisms, and QSAR modeling [27]. Classically, the center is defined as the set of vertices with minimum eccentricity, but this definition yields an impractically high number of central points [28]. We considered CENT as a topological property of importance in larvicidal activity. Active compounds like 3-carene, limonene, and γ-terpinene have small values of CENT, while the less active terpenes (pulegone epoxide, hydrocarvone, and hydrodihydrocarvone) possess high values.
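The classical graph-center definition mentioned above can be illustrated with a short sketch. It computes vertex eccentricities by breadth-first search on the hydrogen-depleted graph of limonene and returns the vertices of minimum eccentricity. The atom numbering and function names are assumptions made for illustration, and the code does not reproduce the CENT descriptor as calculated by Dragon.

```python
from collections import deque

def eccentricities(adjacency):
    """Return {vertex: eccentricity} for a connected, unweighted graph."""
    result = {}
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search gives topological distances
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[start] = max(dist.values())
    return result

def graph_center(adjacency):
    """Vertices whose eccentricity equals the graph radius (classical center)."""
    ecc = eccentricities(adjacency)
    radius = min(ecc.values())
    return sorted(v for v, e in ecc.items() if e == radius)

# Hydrogen-depleted graph of limonene (arbitrary numbering, an assumption):
# ring atoms 1-6, methyl 7 on atom 1, isopropenyl 8 (with 9 and 10) on atom 4.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (1, 7), (4, 8), (8, 9), (8, 10)]
limonene = {i: [] for i in range(1, 11)}
for a, b in edges:
    limonene[a].append(b)
    limonene[b].append(a)

print(eccentricities(limonene))
print(graph_center(limonene))
```

Even in this ten-atom graph the classical definition returns more than one central vertex, which is the practical limitation noted in the text.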
Information derived from the molecular orbitals of the analyzed molecules can be used to derive molecular descriptors related to chemical reactivity and physical properties. The energies of the highest occupied molecular orbital (E_HOMO) and the lowest unoccupied molecular orbital (E_LUMO) are among the most commonly used quantum mechanical descriptors. Other QPAR models obtained by the genetic algorithm analysis include the ionization potential (I) and chemical hardness (η) as two quantum mechanical descriptors directly related to the larvicidal activity. These two descriptors are calculated using E_HOMO and E_LUMO. The ionization potential (also known as the ionization energy) is the energy that must be provided to a system to release an electron, while η, in its strict meaning, corresponds to the resistance of a system to a change or deformation; it is a global property that characterizes the system as a whole and is related to the energy gap between the HOMO and LUMO orbitals [29]. Figure 3 shows the distribution of the HOMO and LUMO orbitals in the chemical structures of five of the compounds evaluated. It shows a uniform distribution of the orbitals in terpinene and limonene, the compounds with relevant larvicidal activity, while carvone, perillaldehyde, and rotundifolone have a polar distribution. Computational tools indicate the presence of hydrophobic and hydrophilic regions on the molecular surface. However, hydrophobic interactions are more important and contribute to the increased larvicidal activity, demonstrating that aspects related to hydrophobicity are more important than steric properties in explaining the biological activity.
Sesquiterpenes have also been shown to have larvicidal and repellent activity. Evidence shows that the activity of these compounds is related to vapor pressure and to electronic properties such as LUMO parameters [30]. The models in that study show that the repellent activity is associated with the LUMO parameters and is inversely proportional to the polarizability of the sesquiterpenes. The results of the present study agree with this assertion.
Carvacrol and thymol were the compounds with the highest larvicidal activity; a similar situation was reported by our research group when evaluating their activity against Culex quinquefasciatus [31]. These two compounds are present in various essential oils, mainly in oregano and thyme, and their larvicidal activity has been widely reported [32,33].
Our research group has previously reported the structural importance of the phenol group in the structure and that, particularly, the position of the hydroxyl group with respect to the aliphatic chain plays an important role in biological activity. While thymol has been shown to have greater antimicrobial activity, in the case of larvicidal activity, carvacrol has been shown to be more relevant [34]. This slight increase in larvicidal activity by carvacrol has also been reported in other investigations [35]. One of these studies demonstrated that the substitution of esters, aldehydes, ethers, and acetic acid for the acidic proton of carvacrol resulted in the maintenance or reduction of larvicidal potency against Ae. aegypti [36].
The larvicidal activity of carvacrol and thymol may be due to different mechanisms of action. Research has suggested that these two compounds act as neurotoxic insecticides, potentiating ligand-gated chloride channels in the nervous system [37,38]. Similarly, it has been reported that thymol may work by blocking octopamine receptors [39]. On the other hand, the role of thymol and carvacrol in interacting with the cholinergic system of insects cannot be overlooked. For example, one study demonstrated the ability of carvacrol to inhibit Aedes albopictus acetylcholinesterase [40].
There are reports that the essential oils and their monoterpenoid components produce neurotoxic poisoning, like that produced by organophosphates and carbamates, where there is an inhibition of the acetylcholinesterase (AChE) enzyme [41,42]. Based on this proposal, we decided to use molecular docking studies on this enzyme to describe the chemical-structural properties involved in the recognition process.
In this receptor, the compounds with higher polarity were those with higher binding efficiencies, with hydrocarvone and hydrodihydrocarvone having the highest efficiency (Figure 4). However, most of the compounds exhibited higher affinity values in a region outside the acetylcholine active site, between residues N226, V227, V228, A253, T254, D259, H260, V271, and S273. This suggests that these compounds could act as allosteric AChE inhibitors, or that they are not able to interact with the active site. Previous studies have shown that limonene, β-myrcene, linalool, and terpineol are potent inhibitors of AChE [43]. In silico studies have shown that linalool can interact with AChE; these results show that linalool binds to a hydrophobic site, interacting with some lipophilic amino acids such as G409, G412, and I413 [44]. Our results also demonstrate the importance of lipophilic residues in this region. However, some authors agree that, in most cases, there is no relationship between the inhibition of AChE and the larvicidal effects of monoterpenes [45,46]. Based on our results, we also suggest that the molecular docking results on AChE do not correlate with larvicidal activity; this could be because the docking studies were performed on only one conformational structure, without considering other biological factors such as the presence of other environmental (physiological) molecules. Nevertheless, the docking studies did show the recognition properties of the target compounds on AChE, which help to explain the experimental data. It is also possible that the compounds exert their action on various biological targets, as has been reported for their antiprotozoal and antifungal activity.
In contrast to AChE, the molecular docking results on SCP-2 correlate with the biological activity and the QSAR models. Carvacrol and thymol, for example, are the compounds with the highest larvicidal activity and binding efficiencies. Additionally, higher-polarity compounds have less favorable binding energies; the same is observed for terpenes with epoxide groups, none of which interacted with residue F105. Lipophilicity plays an essential role in the interaction with SCP-2, since the ligands with the best affinity values are those that interact with the L48, L102, and F105 residues (Figure 6), a finding previously reported when studying the interaction of terpenes and terpenoids with SCP-2 of Culex quinquefasciatus [47]. In addition to carvacrol, limonene and γ-terpinene showed interactions with residues L48 and L102, generating high affinity values.
Absorbed cholesterol is redistributed within the cell, depending on physiological demands. Intracellular transporters carry out the task of mobilizing cholesterol within the cell. One lipid/sterol intracellular transport protein is the sterol carrier protein (SCP-x) [48]. The vertebrate SCP-x is a bipartite protein consisting of a 3-ketoacyl-CoA thiolase domain at the N-terminus and an SCP-2 domain at the C-terminus [49,50]. The present study highlights the possibility that SCP-2 is also a biological target of terpenes and their derivatives, affecting the catabolism of cholesterol, branched fatty acids, and bile acid intermediates [51]. However, experimental studies must be performed to elucidate this effect.

Larvicidal Activities
Larval mortality bioassays were performed according to the WHO recommended methodology [31,47,52] using third instar larvae of the Rockefeller lineage. To obtain the lethal concentration for 50% mortality (LC50) and the 95% confidence interval (CI) values, a Probit analysis was carried out with the mortality data set for each of the compounds evaluated.
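As an illustration of how a Probit analysis yields an LC50, the following minimal sketch fits P(death) = Φ(a + b·log10(dose)) by iteratively reweighted least squares and returns LC50 = 10^(−a/b). The function names and the mortality counts are hypothetical, the software actually used by the authors is not stated in the text, and the confidence-interval calculation is omitted.

```python
from math import log10
from statistics import NormalDist

STD_NORMAL = NormalDist()

def _weighted_line(x, z, w):
    """Weighted least-squares fit of z = a + b*x; returns (a, b)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swz = sum(wi * zi for wi, zi in zip(w, z))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
    b = (sw * swxz - swx * swz) / (sw * swxx - swx ** 2)
    a = (swz - b * swx) / sw
    return a, b

def probit_lc50(doses, exposed, dead, iterations=25):
    """Fit P(death) = Phi(a + b*log10(dose)); return (LC50, a, b)."""
    x = [log10(d) for d in doses]
    frac = [r / n for r, n in zip(dead, exposed)]
    # Starting values: straight line through the empirical probits
    # (proportions clamped so that inv_cdf stays finite).
    z0 = [STD_NORMAL.inv_cdf(min(max(p, 0.01), 0.99)) for p in frac]
    a, b = _weighted_line(x, z0, exposed)
    for _ in range(iterations):  # Fisher scoring (IRLS) for the probit model
        eta = [a + b * xi for xi in x]
        p = [min(max(STD_NORMAL.cdf(e), 1e-9), 1 - 1e-9) for e in eta]
        phi = [max(STD_NORMAL.pdf(e), 1e-12) for e in eta]
        w = [n * f * f / (pi * (1 - pi)) for n, f, pi in zip(exposed, phi, p)]
        z = [e + (y - pi) / f for e, y, pi, f in zip(eta, frac, p, phi)]
        a, b = _weighted_line(x, z, w)
    return 10 ** (-a / b), a, b

# Hypothetical bioassay: concentrations in µg/mL, 25 larvae per concentration.
doses = [5, 10, 20, 40, 80]
exposed = [25, 25, 25, 25, 25]
dead = [2, 6, 13, 21, 24]
lc50, intercept, slope = probit_lc50(doses, exposed, dead)
print(f"LC50 = {lc50:.1f} µg/mL, slope = {slope:.2f}")
```

Working on the log10 dose scale is the usual convention for dose-response data; a dedicated statistics package would normally be used to obtain the 95% CI as well.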
Compounds were purchased from a Sigma-Aldrich (St. Louis, MO, USA) distributor, and their chemical structures are shown in Figure 1.

In Silico Optimization and Descriptors Calculation
Computational analysis was carried out following the methodology previously described by our research group [47]. Each molecular system was constructed and studied by conformational analysis using Spartan 03 software (Wavefunction Inc., CA, USA) [53] and the SYBYL molecular mechanics force field [54]. Subsequently, geometry optimization was performed at the PM3 level [55]. Once the minimum energy geometry was obtained, the descriptors were calculated using Dragon 5.4 (Talete, MI, Italy) [56].
To obtain the chemical reactivity descriptors, the Gaussian 09 program (Gaussian Inc., Wallingford, CT, USA) [57] was used to perform Density Functional Theory (DFT) calculations in the aqueous phase, using the B3LYP functional [58,59] in combination with the 6-311G(d,p) basis set and the conductor-like polarizable continuum model (CPCM) [60]. The energies of the frontier orbitals and Koopmans' theorem [61] were applied to calculate the chemical reactivity descriptors.
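The Koopmans-type relations referred to above can be sketched as follows. The function name and the orbital energies are hypothetical, and the softness convention S = 1/(2η) is an assumption, since some authors use S = 1/η instead.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors (eV) from frontier orbital energies (eV).

    Uses the Koopmans approximation: I = -E_HOMO and A = -E_LUMO.
    """
    ionization_potential = -e_homo
    electron_affinity = -e_lumo
    energy_gap = e_lumo - e_homo
    hardness = (ionization_potential - electron_affinity) / 2
    softness = 1 / (2 * hardness)  # assumed convention; in 1/eV
    return {
        "I": ionization_potential,
        "A": electron_affinity,
        "gap": energy_gap,
        "eta": hardness,
        "S": softness,
    }

# Hypothetical orbital energies, not taken from Table 5.
for name, value in reactivity_descriptors(e_homo=-6.0, e_lumo=-0.5).items():
    print(f"{name}: {value:.3f}")
```

With these definitions the hardness is exactly half of the HOMO-LUMO gap, which is why the compound with the largest gap also has the largest η.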

Structure-Property-Larvicidal Activity Models
Quantitative structure-activity relationship (QSAR) and quantitative property-activity relationship (QPAR) studies were performed using the experimentally determined LC50 and the theoretically calculated descriptors of each molecular system. The models were constructed using genetic algorithms and were evaluated based on four statistical variables to find the most satisfactory models [62].
The initial strategy was based on generating mathematical models exclusively from structural descriptors (QSAR) to determine the most important functional groups in larvicidal activity. Subsequently, models were generated using the physicochemical, topological, and chemical reactivity descriptors (QPAR) to find the molecular properties that stand out and relate to the LC50.
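The general idea of genetic-algorithm descriptor selection can be sketched with a toy example. This is not the authors' implementation: the fitness function (leave-one-out Q2 of a multiple linear regression), the population settings, the function names, and the synthetic data are all assumptions chosen for illustration.

```python
import random

def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return None  # singular design matrix
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def q2_loo(X, y, cols):
    """Leave-one-out cross-validated Q2 for the descriptor subset `cols`."""
    press = 0.0
    for i in range(len(y)):
        x_train = [[row[c] for c in cols] for j, row in enumerate(X) if j != i]
        y_train = [v for j, v in enumerate(y) if j != i]
        beta = ols_fit(x_train, y_train)
        if beta is None:
            return float("-inf")
        pred = beta[0] + sum(b * X[i][c] for b, c in zip(beta[1:], cols))
        press += (y[i] - pred) ** 2
    mean = sum(y) / len(y)
    return 1 - press / sum((v - mean) ** 2 for v in y)

def ga_select(X, y, n_select=2, pop_size=20, generations=30, seed=0):
    """Evolve descriptor subsets of size n_select that maximize Q2(LOO)."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    cache = {}

    def fitness(subset):
        if subset not in cache:
            cache[subset] = q2_loo(X, y, subset)
        return cache[subset]

    pop = [tuple(sorted(rng.sample(range(n_desc), n_select)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: the child draws its descriptors from both parents.
            child = set(rng.sample(sorted(set(a) | set(b)), n_select))
            if rng.random() < 0.2:  # mutation: swap one descriptor
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([d for d in range(n_desc) if d not in child]))
            children.append(tuple(sorted(child)))
        pop = parents + children
    return max(pop, key=fitness)

# Synthetic data (hypothetical): activity depends only on columns 1 and 3.
descriptors = [[0.5, 1.0, 3.0, 2.0], [1.5, 2.0, 1.0, 0.0], [0.2, 3.0, 4.0, 1.0],
               [2.5, 4.0, 2.0, 3.0], [1.1, 5.0, 0.0, 1.0], [3.0, 6.0, 5.0, 4.0],
               [0.7, 7.0, 3.0, 0.0], [2.2, 8.0, 1.0, 2.0]]
activity = [1 + 2 * row[1] - 3 * row[3] for row in descriptors]
best = ga_select(descriptors, activity)
print(best, q2_loo(descriptors, activity, best))
```

Using a cross-validated statistic as the fitness function, rather than the fitted R2, penalizes subsets that merely overfit the training set; the four statistics actually used to evaluate the models are those of reference [62].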

Molecular Docking Studies
Docking studies were carried out on the sterol carrier protein (SCP-2) reported in the RCSB Protein Data Bank (PDB ID: 1PZ4) [63] and on the acetylcholinesterase (AChE) of Ae. aegypti. The protein sequence of Ae. aegypti AChE (GenBank: ABN09910.1) [64] was obtained from the database of the National Center for Biotechnology Information (NCBI). The protein was modeled through the Swiss-Model server [65,66], using the AChE of Mus musculus (PDB ID: 2WU6) as the template [67]. The final model was subjected to Ramachandran analysis using the Rampage server [68]. Docking analysis was performed for each experimentally tested molecule on both proteins (the modeled AChE and SCP-2, PDB ID: 1PZ4) using AutoDock4 v4.2.6 (Scripps Research, CA, USA) [69]. For the docking studies, the water molecules were removed from 1PZ4, and the active site was defined considering the residues within a grid of 60 Å × 60 Å × 60 Å centered in the active site, with an initial population of 100 randomly placed individuals and a maximum of 1.0 × 10^7 energy evaluations. A blind docking procedure was carried out as well. The compounds were drawn in GaussView and, before docking, were subjected to energy minimization using the hybrid functional B3LYP with the 6-311G(d,p) basis set. The ΔG (kcal/mol) values were taken from the conformation with the lowest free energy of the ligand bound to the protein targets. The figures were prepared with ChemBioOffice v 22.0 (PerkinElmer, MA, USA) [70] for the structures and Chimera 2021 (RBVI, San Francisco, CA, USA) [71] for the proteins and ligands.
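The selection of the lowest-energy conformation for each ligand, followed by ranking the ligands by ΔG, can be sketched as below. The pose list, the energy values, and the function names are hypothetical placeholders and are not results from Tables 6 or 7.

```python
def best_pose_energies(poses):
    """Map each ligand to the lowest ΔG (kcal/mol) among its docked poses."""
    best = {}
    for ligand, delta_g in poses:
        if ligand not in best or delta_g < best[ligand]:
            best[ligand] = delta_g
    return best

def rank_ligands(poses):
    """Ligands sorted from most to least favorable (most negative ΔG first)."""
    return sorted(best_pose_energies(poses).items(), key=lambda item: item[1])

# Hypothetical (ligand, ΔG) pairs, several poses per ligand.
poses = [
    ("ligand_A", -5.1), ("ligand_A", -5.8), ("ligand_A", -4.9),
    ("ligand_B", -6.3), ("ligand_B", -6.0),
    ("ligand_C", -4.2), ("ligand_C", -4.7),
]
for ligand, delta_g in rank_ligands(poses):
    print(f"{ligand}: {delta_g:.1f} kcal/mol")
```

Because more negative ΔG values indicate more favorable binding, an ascending sort places the highest-affinity ligands first.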

Conclusions
The theoretical characterization of the structural, molecular, and quantum mechanical properties of terpenes and their derivatives is presented, as well as their relation to the larvicidal activity against Ae. aegypti third-instar larvae. Additionally, it is concluded that the terpenes can interact with AChE and SCP-2 and that this interaction can describe the experimental biological activity data structurally. Tools such as QSAR and molecular docking provide a basis for the rational design of and search for new larvicidal agents, taking advantage of statistically significant mathematical models and robust tools for predicting the biological activities of terpene compounds.

Data Availability Statement:
The data presented in this study are available on request from the corresponding authors.