Molecular Modeling of Chemosensory Protein 3 from Spodoptera litura and Its Binding Property with Plant Defensive Metabolites

Chemosensory perception in insects involves a broad set of chemosensory proteins (CSPs) that identify the bouquet of chemical compounds present in the external environment and regulate specific behaviors. The current study is focused on the Spodoptera litura (Fabricius) chemosensory-related protein, SlitCSP3, a midgut-expressed CSP, which demonstrates differential gene expression upon different diet intake. There is an intriguing possibility that SlitCSP3 can perceive food-derived chemical signals and modulate insect feeding behavior. We predicted the three-dimensional structure of SlitCSP3 and subsequently performed an accelerated molecular dynamics (aMD) simulation of the best-modeled structure. SlitCSP3 structure has six α-helices arranged as a prism and a hydrophobic binding pocket predominated by leucine and isoleucine. We analyzed the interaction of selected host plant metabolites with the modeled structure of SlitCSP3. Out of two predicted binding pockets in SlitCSP3, the plant-derived defensive metabolites 2-b-D-glucopyranosyloxy-4-hydroxy-7-methoxy-1, 4-benzoxazin-3-one (DIMBOA), 6-Methoxy-2–benzoxazolinone (MBOA), and nicotine were found to interact preferably to the hydrophobic site 1, compared to site 2. The current study provides the potential role of CSPs in recognizing food-derived chemical signals, host-plant specialization, and adaptation to the varied ecosystem. Our work opens new perspectives in designing novel pest-management strategies. It can be further used in the development of CSP-based advanced biosensors.


Introduction
Spodoptera litura is a generalist lepidopteran pest with diverse host plant range around the world, including the Asian continent. S. litura has a polyphagous larval stage, which defoliates approximately 300 plant species [1]. It has a vast host choice, ranging from weeds and vegetables to economically

Sequence Alignment and 3D Modeling of SlitCSP3 Protein
For the homology modeling of S. litura CSP3 (SlitCSP3), a protein sequence with 123 amino acid residues was retrieved from the National Center for Biotechnology Information (NCBI) database (GenBank ID: ALJ30214.1). The obtained sequence was used to model the protein structure based on homology techniques using Modeller. The closest homology of the SlitCSP3 protein sequence (46% identity) was found with the crystal structure of chemosensory protein 1 of Bombyx mori (Linnaeus) [18] in the Protein Databank (PDB ID: 2JNT), and thus used as a template for structural modeling. Based on the sequence alignment of the target and template chains, the 102 amino acid residues in the long region  were modeled. However, the full sequence of template CSP1 of B. mori consists of 123 amino acids (GenBank ID: NP_001037065), of which the N-terminal region from 1-18 residues is the signal peptide. Their sequence alignment is shown in Figure 1A, where a very low sequence identity can be observed for the N-terminal. In contrast, a high sequence identity was observed for the rest of the alignment.
Int. J. Mol. Sci. 2020, 21,4073 3 of 15 the long region  were modeled. However, the full sequence of template CSP1 of B. mori consists of 123 amino acids (GenBank ID: NP_001037065), of which the N-terminal region from 1-18 residues is the signal peptide. Their sequence alignment is shown in Figure 1A, where a very low sequence identity can be observed for the N-terminal. In contrast, a high sequence identity was observed for the rest of the alignment. As per the parameters defined by Modeller for a reliable protein model, the parameters obtained for the modeled SlitCSP3 were the E-value of zero, GA341 score of 1.00, ModPipe Quality Score (MPQS) value of 1.4837, and zDOPE score of -1.09, and all calculated values suggested the reliability of the modeled SlitCSP3 protein structure. A cartoon representation of the modeled CSP3 structure is shown in Figure 1B, and its hydrophobic surface is represented in Figure 1C. Consequently, our analysis to check the presence of the signal peptide in SlitCSP3 suggested its presence from residues 1-17 with the cleavage site marked at the 18th residue ( Figure 1D). to see the conservation between SlitCSP3 and B. mori CSP1 precursor sequence. The sequences are well-conserved except at the N-terminus, which was not included during structure modeling. (B) Three-dimensional model predicted through homology modeling. (C) A representation of its hydrophobic surface with orange depicting the hydrophobic core. (D) The prediction of signal peptide sequence present in SlitCSP3 at the N-terminus till 17th residue, and 18th is the cleavage site.
As per the parameters defined by Modeller for a reliable protein model, the parameters obtained for the modeled SlitCSP3 were the E-value of zero, GA341 score of 1.00, ModPipe Quality Score (MPQS) value of 1.4837, and zDOPE score of -1.09, and all calculated values suggested the reliability of the modeled SlitCSP3 protein structure. A cartoon representation of the modeled CSP3 structure is shown in Figure 1B, and its hydrophobic surface is represented in Figure 1C. Consequently, our analysis to check the presence of the signal peptide in SlitCSP3 suggested its presence from residues 1-17 with the cleavage site marked at the 18th residue ( Figure 1D).

Accelerated Molecular Dynamics on the Modeled SlitCSP3
To achieve the best model, an accelerated MD was carried out, which is a bias potential function used to make the simulation "jump over" high energy barriers and to sample rare events [16,17]. An ensemble of structures was obtained after 500 ns long aMD simulation and the structures were clustered to obtain the most probable overall conformation for SlitCSP3.
The darkest blue regions of the free energy landscape (FEL) plot signify the energy minima of possible conformations, out of which the largest region denotes the first cluster ( Figure 2A). A separate region denotes the incompletely folded SlitCSP3 conformation that is separated by a large energy difference (~5 kcal mol −1 ). The whole trajectory was clustered into the top ten clusters, out of which the representative structures of the first four clusters have been superimposed and are depicted in Figure 2B.

Accelerated Molecular Dynamics on the Modeled SlitCSP3
To achieve the best model, an accelerated MD was carried out, which is a bias potential function used to make the simulation "jump over" high energy barriers and to sample rare events [16,17]. An ensemble of structures was obtained after 500 ns long aMD simulation and the structures were clustered to obtain the most probable overall conformation for SlitCSP3.
The darkest blue regions of the free energy landscape (FEL) plot signify the energy minima of possible conformations, out of which the largest region denotes the first cluster ( Figure 2A). A separate region denotes the incompletely folded SlitCSP3 conformation that is separated by a large energy difference (~ 5 kcal mol −1 ). The whole trajectory was clustered into the top ten clusters, out of which the representative structures of the first four clusters have been superimposed and are depicted in Figure 2B. The superimposed cartoon representation of the top clusters. The major conformational differences can be seen in the C-terminus. (C) The root-mean-square deviation calculated for the whole trajectory to assess its stability. (D) The root-mean-square atomic fluctuation calculated for all residues indicates that major changes occurred at the N-terminus through its folding pathway. Figure 2C describes the root mean square deviation (RMSD) graph, the value of RMSD rapidly decreases until 50,000 steps, which show that the structures obtained during this part of the simulation were very different from the reference. There is a full dip at the 100,000th step, In the root mean square fluctuation (RMSF) graph a considerable part of the N-terminal (until 40 residues) show high fluctuation, and then again show a spike from the 60-75th residues, as shown in Figure  2D. Figure 3A shows the post-simulated SlitCSP3 structure. The structure has helix 1 from Arg4 to Asn16, helix 2 from Leu20 to Leu28, helix 3 from Pro35 to Ser53, helix 4 from Glu57 to Ala73, helix 5 from Gln75 to Ala83, and the last helix 6 from Gln90 to Ala100. The secondary structure is depicted The superimposed cartoon representation of the top clusters. The major conformational differences can be seen in the C-terminus. (C) The root-mean-square deviation calculated for the whole trajectory to assess its stability. (D) The root-mean-square atomic fluctuation calculated for all residues indicates that major changes occurred at the N-terminus through its folding pathway. Figure 2C describes the root mean square deviation (RMSD) graph, the value of RMSD rapidly decreases until 50,000 steps, which show that the structures obtained during this part of the simulation were very different from the reference. There is a full dip at the 100,000th step, In the root mean square fluctuation (RMSF) graph a considerable part of the N-terminal (until 40 residues) show high fluctuation, and then again show a spike from the 60-75th residues, as shown in Figure 2D. Figure 3A shows the post-simulated SlitCSP3 structure. The structure has helix 1 from Arg4 to Asn16, helix 2 from Leu20 to Leu28, helix 3 from Pro35 to Ser53, helix 4 from Glu57 to Ala73, helix 5 from Gln75 to Ala83, and the last helix 6 from Gln90 to Ala100. The secondary structure is depicted in Figure 3B. The secondary structural assignment clearly shows that the majority of the structure is composed of long helical regions connected by turns or coils. No extended conformations were found in the post-simulated SlitCSP3 structures. in Figure 3B. The secondary structural assignment clearly shows that the majority of the structure is composed of long helical regions connected by turns or coils. No extended conformations were found in the post-simulated SlitCSP3 structures. The Ramachandran plots were obtained using RAMPAGE software (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php) depicted in Supplementary Figure S1. The shaded regions define the favored (in blue) and allowed areas (in sand color). As evident from the plot, most of the residues fall in the alpha-helical region marked by black squares and triangles. Few of them fall in the extended sheet regions even though no actual beta-strands formed in the SlitCSP3 structure. The residues that form turns are typically present in the extended regions. The other residues marked with orange squares and triangles fall in the "allowed" regions, out of which three residues fall in the left-handed helical regions.

Interaction of SlitCSP3 with Plant-Defensive Metabolites
To confirm the interaction of preselected plant-defensive metabolites DIMBOA, HDMBOA, MBOA, gossypol, and nicotine with the post-simulated structure of SlitCSP3, a high-precision The Ramachandran plots were obtained using RAMPAGE software (http://mordred.bioc.cam. ac.uk/~{}rapper/rampage.php) depicted in Supplementary Figure S1. The shaded regions define the favored (in blue) and allowed areas (in sand color). As evident from the plot, most of the residues fall in the alpha-helical region marked by black squares and triangles. Few of them fall in the extended sheet regions even though no actual beta-strands formed in the SlitCSP3 structure. The residues that form turns are typically present in the extended regions. The other residues marked with orange squares and triangles fall in the "allowed" regions, out of which three residues fall in the left-handed helical regions.

Interaction of SlitCSP3 with Plant-Defensive Metabolites
To confirm the interaction of preselected plant-defensive metabolites DIMBOA, HDMBOA, MBOA, gossypol, and nicotine with the post-simulated structure of SlitCSP3, a high-precision docking protocol was carried out using the Schrodinger's glide module. Out of these five, only three ligands, namely DIMBOA, MBOA, and nicotine, were found to weakly bind with SlitCSP3 at site 1 and only DIMBOA and nicotine bound weakly at site 2. DIMBOA scored the highest with a glide score of -5.40 kcal/mol at site 1 and -5.142 kcal/mol at site 2 followed by nicotine with a glide score of -4.077 kcal/mol at site 1 and -4.838 kcal/mol at site 2. MBOA only showed binding with site 1 with a glide score of -4.214 kcal/mol. Gossypol is a bulkier molecule and would not fit in either binding site, hence it does not show any interaction. An analysis of intermolecular hydrogen bond formation between SlitCSP3 and binding ligands is indicated in Figure 4 and Table 1. The second site demonstrated low binding to selected ligands. The graphical representation of interactions at site 1 and site 2 are provided in Figure 4. docking protocol was carried out using the Schrodinger's glide module. Out of these five, only three ligands, namely DIMBOA, MBOA, and nicotine, were found to weakly bind with SlitCSP3 at site 1 and only DIMBOA and nicotine bound weakly at site 2. DIMBOA scored the highest with a glide score of -5.40 kcal/mol at site 1 and -5.142 kcal/mol at site 2 followed by nicotine with a glide score of -4.077 kcal/mol at site 1 and -4.838 kcal/mol at site 2. MBOA only showed binding with site 1 with a glide score of -4.214 kcal/mol. Gossypol is a bulkier molecule and would not fit in either binding site, hence it does not show any interaction. An analysis of intermolecular hydrogen bond formation between SlitCSP3 and binding ligands is indicated in Figure 4 and Table 1. The second site demonstrated low binding to selected ligands. The graphical representation of interactions at site 1 and site 2 are provided in Figure 4.

Accelerated MD Simulations of SlitCSP3-Ligand Complexes
DIMBOA-bound complexes at the two predicted sites were selected for accelerated MD simulations with weaker boost parameters, owing to the highest docking score. The post-simulation interaction between DIMBOA and SlitCSP3 at site 1 has been shown in a cartoon representation in Figure 5A. Only one hydrogen bond is formed with Thr27, while Leu28, Asp29, Thr77, Gln80, and Ile81 form hydrophobic interactions. To confirm whether binding with DIMBOA has any conformational effect on the SlitCSP3 binding site 1, we calculated the average correlation between atom motions, as shown in Figure 5B. The correlated motions have been marked by square-shaped highlights with the three bars (in black, red and blue), representing the three sides of the triangular-shaped site. It shows slight expulsion of DIMBOA from the binding site cavity ( Figure 5C). To confirm this, we also calculated distances (in Å) as a function of simulation time between certain residues of the binding site and the C atom of DIMBOA ( Figure 5D).

Accelerated MD Simulations of SlitCSP3-Ligand Complexes
DIMBOA-bound complexes at the two predicted sites were selected for accelerated MD simulations with weaker boost parameters, owing to the highest docking score. The post-simulation interaction between DIMBOA and SlitCSP3 at site 1 has been shown in a cartoon representation in Figure 5A. Only one hydrogen bond is formed with Thr27, while Leu28, Asp29, Thr77, Gln80, and Ile81 form hydrophobic interactions. To confirm whether binding with DIMBOA has any conformational effect on the SlitCSP3 binding site 1, we calculated the average correlation between atom motions, as shown in Figure 5B. The correlated motions have been marked by square-shaped highlights with the three bars (in black, red and blue), representing the three sides of the triangular-shaped site. It shows slight expulsion of DIMBOA from the binding site cavity ( Figure  5C). To confirm this, we also calculated distances (in Å) as a function of simulation time between certain residues of the binding site and the C atom of DIMBOA ( Figure 5D).  To analyze the binding affinity of DIMBOA at the two sites, the Molecular Mechanics/ Poisson-Boltzmann Surface Area (MM/PBSA) calculations were reported that account for per-residue contribution in the binding (Figures 5 and 6).
The binding site changes its shape upon interaction with DIMBOA in such a way, that the two-ring systems of DIMBOA can no longer fit ( Figure 6C). The distances calculated between DIMBOA and various site 2 residues show a remarkably similar pattern ( Figure 6D). Shortly after a few nanoseconds of simulation, the distances between DIMBOA and residues Trp78, Ala83, Tyr85, Asp89, Gln90, and Tyr91 increase more than 30 Å. Although this distance seems to decrease around 200 ns and few more times, the majority of DIMBOA position seems to lie far away from the binding site 2. Figure 6E depicts the contribution of each residue in the stabilization of this system where Tyr91 shows the highest contribution followed by Trp78 and Lys79. All these residues lie away from the loop region where DIMBOA is expelled out of the cavity. The results clearly show that site 1 is the most probable binding site in SlitCSP3.

Discussion
In homology modeling, the SlitCSP3 protein sequence demonstrated the closest similarity to the crystal structure of chemosensory protein 1 of B. mori [18]. The N-terminal region of protein could not be modeled due to the lack of a more suitable template. The SlitCSP3 protein model obtained by A similar analysis was carried out for DIMBOA bound at site 2, which includes a triangular loop region before the C-terminus. The site spans from Trp78 to Asn94. As can be seen in Figure 6A, the previously bound DIMBOA has been completely expelled from the site, which can be attributed to the large correlation between atomic motions at the binding site (in Figure 6B). For example, Ala83 and Tyr91 lie exactly opposite to each other in the 3D space and show highly correlated movement. The binding site changes its shape upon interaction with DIMBOA in such a way, that the two-ring systems of DIMBOA can no longer fit ( Figure 6C). The distances calculated between DIMBOA and various site 2 residues show a remarkably similar pattern ( Figure 6D). Shortly after a few nanoseconds of simulation, the distances between DIMBOA and residues Trp78, Ala83, Tyr85, Asp89, Gln90, and Tyr91 increase more than 30 Å. Although this distance seems to decrease around 200 ns and few more times, the majority of DIMBOA position seems to lie far away from the binding site 2. Figure 6E depicts the contribution of each residue in the stabilization of this system where Tyr91 shows the highest contribution followed by Trp78 and Lys79. All these residues lie away from the loop region where DIMBOA is expelled out of the cavity. The results clearly show that site 1 is the most probable binding site in SlitCSP3.

Discussion
In homology modeling, the SlitCSP3 protein sequence demonstrated the closest similarity to the crystal structure of chemosensory protein 1 of B. mori [18]. The N-terminal region of protein could not be modeled due to the lack of a more suitable template. The SlitCSP3 protein model obtained by Modeller was found reliable as per the validation parameters. The cartoon representation in Figure 1C states that the binding cavity where site 1 is marked is hydrophobic. There is a presence of a cleavage site at 18th residue, represented in Figure 1D. It is hypothesized that the complete 123 residue-long CSP3 is a precursor form, and the removal of signal peptide must be the functionally active form. The accelerated molecular dynamics of the modeled structure was carried out to attain a close-to-native SlitCSP3 conformation. An ensemble of structures was obtained. In the SlitCSP3 modeled structure, none of the residues falls into the disallowed or outlier regions, which validates the modeled structure. The significant structural differences were observed in C-terminal folding. The representative structure of the most populated cluster lying in the energy minimum was selected for all further studies.
The root mean square of the coordinates of each "frame" of the simulation in comparison to a reference structure was calculated. RMSD analysis observed full dip at the 100000th step, which shows that the RMSD value is zero; in other words, this frame has the same structure as our reference. The graph goes on to become relatively stable after that with just a few considerable peaks, which means that no considerably significant changes occurred in the structure during the simulation after that. The simulation is more or less stable and has reached equilibrium; or, in other words, has reached "convergence". The RMSF graph measures the deviation of the position of a particle with respect to a reference position over time. It seems that the residues lining the first binding site at helix 4 show the high fluctuation along with the N-terminus residues. The simulated SlitCSP3 structure complements the description of the experimentally determined structure of B. mori CSP1 [18]. Jansen et al. described it as a structure comprised of six helices arranged into a shape of a prism, which also holds for simulated SlitCSP3. The walls of the prism are formed by pairs of helices 1-2 and helices 4-5. The hydrophobic sides are generally composed of leucine and isoleucine; this cavity formed by helices 1, 2, 4, and 5 is the possible binding site. A similarity in the fold structure of B. mori CSP1 was established with chemosensory proteins derived from Mamestra brassicae, CSPMbraA6 [19,20] and Schistocerca gregaria CSPsg4 [21]. It is evident that this site is a highly conserved domain of CSPs and must play a crucial functional role in binding with hydrophobic ligands.
CASTp and binding site module of DS 4.0 shortlisted two binding sites based on area and volume. The docking results of native-like SlitCSP3 with selected plant metabolites were surprising. The structural difference between DIMBOA and HDMBOA is only an extra -CH 3 (-methyl) group present in the latter. Although, this -CH 3 group masks the O 3 atom that forms an H-bond with Arg38 in the case of DIMBOA. This interaction could be crucial to stabilize the overall interaction, which is lacking in the case of HDMBOA. An analysis of intermolecular hydrogen bond formation between SlitCSP3 and binding ligands indicates the importance of Arg32 and Arg38, especially in the case of DIMBOA and MBOA ( Figure 4, Table 1). The second site does not contain any arginine residue, and this explains the low binding of selected ligands.
On monitoring the docked complexes, we were curious to assess their stability and dynamics. To compare the two predicted binding sites on SlitCSP3, we carried out two 500 ns long accelerated MD simulations with weaker boost parameters to capture low-energy conformations. For this, only DIMBOA-bound complexes at the two predicted sites were selected, owing to the highest docking score. The post-simulation interaction between DIMBOA and SlitCSP3 at site 1 indicates that the three sides of binding at site 1 undergo certain correlated motions upon binding with a ligand, which may explain the slight expulsion of DIMBOA from the binding site cavity. It seems that DIMBOA has shifted from its original binding site. Although, all residues, Arg38, Lys45, Gln49, Tyr77, Lys84 show considerably fluctuating distance parameters from DIMBOA, the largest fluctuations are shown by Gln49 and Lys84 after 200 ns. It indicates that DIMBOA moves further away from the helix 3 side of the binding cavity. The end distance between Gln49 and Lys84 from DIMBOA is close to 15 Å. MM-PBSA calculation clarifies that Lys45, Gln49, Ile81 and Lys84 contribute the highest to the free energy of binding with DIMBOA. All these residues engage in hydrophobic interaction while the residues that form H-bond with DIMBOA, like Arg32 and Arg38, show an insignificant contribution to the overall free energy of binding. These results indicate that robust hydrophobic interactions are crucial than H-bond formation in case of binding with SlitCSP3, which also explains the relatively low binding score [22]. The MM/PBSA was reported to perform better in calculating binding free energies than the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA ) method [23].
Accelerated MD simulations of DIMBOA bound at site 2 reported complete expulsion of DIMBOA from the binding site, due to change in the shape of the binding site. Also, owing to the highest dynamic fluctuation reported in the C-terminal region, this interaction is energetically unsupported. Based on these results, it is safe to conclude that the binding of DIMBOA at site 2 is not stable, which in turn indicates that this is not a binding site of SlitCSP3.

Sequence Retrieval and Structure Modeling
The 123 amino acid long putative SlitCSP3 protein sequence was retrieved from the NCBI database (GenBank ID: ALJ30214.1). The initial three-dimensional coordinates for CSP3 were obtained through a homology-based structure modeling. It was carried out through the webserver of Modeller named Modweb (https://modbase.compbio.ucsf.edu/modweb/) [24]. The default "slow" PSI-BLAST [25,26] method was chosen for fold assignment with adequate sampling. The chemosensory protein 1 (CSP1) of B. mori (PDB ID: 2JNT) was identified as the template. Based on the model score, ModPipe quality score, expect value, and z-DOPE value, the best-modeled structure was confirmed. However, as the alignment matched the sequence length of 21-122 amino acid of the target protein to the total 102 residues long template protein, the resultant protein model constituted only 102 amino acid residues. The precursor sequence of CSP1 of B. mori consists of 123 amino acids (GenBank ID: NP_001037065), of which the N-terminal region from 1-18 residues is signal peptide region. A calculation was carried out on the SignalP-5.0 [27] server (http://www.cbs.dtu.dk/services/SignalP-5.0/) to confirm the presence of N-terminal signal peptide in SlitCSP3 sequence.

Accelerated Molecular Dynamics
To improve the quality of the modeled structure and to understand its dynamics, a 500 ns long accelerated molecular dynamics simulation (aMD) was carried out. The solvated SlitCSP3 system was prepared for aMD in six consecutive steps [28]. The pressure and temperature scaling were carried out using Berendsen barostat and Langevin thermostat, respectively. SHAKE bond length constraints were applied to all bonds involving hydrogen. A short classical MD run for 100 ns was also carried out for each aMD run to calculate the torsional and total energy boost parameters.
Following Tyagi et al. [28], for each aMD simulation, particle mesh Ewald summation (PME) was used to calculate the electrostatic interactions, while long-range interactions were calculated with a cutoff of 10.0. The simulation conditions were 300 K temperature and two fs time step. The energy and boost information recorded at 1000 time-step. The NIIF clusters of Szeged and Debrecen, Hungary were sourced for running simulations on GPUs using pmemd.cuda implementation of Amber14.
The aMD simulations required extra parameters E dihed , α dihed , E total and α total , as provided in Equation 1: (1) where N res is the number of peptide residues (102 residues) and N atoms is the total number of atoms in the system, which is 15,860 in the CSP3 system. V avg_dihed and V avg_total are the average dihedral and total potential energies obtained from the 100 ns long classical MD run. The values of coefficients a1,a2 was chosen to be 4 kcal/mol and b1, b2 to be 0.16 kcal/mol based on a study by Pierce et al. [29].
The trajectory was clustered using 'grcarma' [30] based on its version of dihedral principal component analysis (PCA). The dihedral based PCA was also carried out using the cpptraj module. The phi and psi torsion angles were calculated for all the residues, and the covariance matrix was calculated, which also calculates eigenvectors. The first two principal components were reweighted by the Maclaurin series expansion method [31,32].

Grid Generation and Docking
The docking of ligands was directed at specific sites on the target enzymes which were specified by forming a cubic grid (5Ȧ × 5Ȧ × 5Ȧ) around the selected residues using the Receptor Grid Generation platform of Schrödinger's glide module [34,35]. The five ligands were prepared for docking by 2D to 3D molecular conversion using the LigPrep [36] module while using the default OPLS3e force field. All docking calculations were carried out using the Standard Precision (SP) protocol.

Simulations of Docked Complexes and MM/PBSA Analysis
The docked complexes of SlitCSP3 with DIMBOA at two different binding sites were also simulated, but with lesser aggressive boost parameters to effectively capture the low energy conformations involved with ligand binding. Table 2 has a summary of these parameters and their respective energy values. The partial charges and parameters for DIMBOA were calculated using "antechamber" [37] and "leap" tools from AmberTools18 (provided in Supplementary file 1) [38].
To analyze the stability of these complexes during aMD simulations, we carried out Molecular Mechanics/Poisson Boltzmann Surface Area (MM/PBSA) [39] calculations for estimation of the free energy of binding. It is also possible to calculate the free energy decomposition as a contribution from different amino acid residues of the target protein. The GBSA model was used, where the solvent contribution to the free energy calculations was realized using a continuum or implicit solvent model.
In this method, the free energy of binding (∆G bind ) between a protein (P) and ligand (L) to form the complex PL is given as: Equation (2): Equation (3): Equation (4): ∆Gsol = ∆G PB/GB + ∆G SA (4) where, ∆E MM + ∆G sol , and -T∆S are the change of the gas phase molecular mechanical energy, the solvation free energy, and the conformational entropy upon binding [23], respectively. A detailed description of the underlying method is provided by Wang et al. [40,41].

Conclusions
We have performed computer-aided structural modeling of S. litura gut-expressed SlitCSP3 followed by a structure-based analysis of its recognition abilities and binding preferences to different host plant-derived chemical signals. In silico predicted SlitCSP3 structure was close to the native structure of chemosensory proteins, complements CSPs structure predicted experimentally. Out of five selected host plant metabolites (DIMBOA, HDMBOA, MBOA, nicotine, and gossypol), DIMBOA, MBOA, and nicotine demonstrated binding at predicted site 1. However, these ligands bound weakly to the binding site. We have found site 1 to be the appropriate binding site for food-derived metabolites. The MD simulation of SlitCSP3-DIMBOA complex showed the vitality of robust hydrophobic interactions than hydrogen bond formation in binding to SlitCSP3. As the binding cavity is lined by hydrophobic sides of the helices, making it well suited to bind through the hydrophobic interaction [20,22,42]. This is the reason for the low binding score. Arg32 and Arg38, which were crucial in DIMBOA interaction, proved insignificant to the overall free energy of binding on simulation. DIMBOA and nicotine could also bind to site 2 with a low glide score. As CSPs show differential recognition abilities and binding preference, SlitCSP3 might have a different binding preference from the selected host plant metabolites of the current study. Finally, the modeled structure of SlitCSP3 could be tested for its interaction with different ligands/substrates including, pesticides, because it is expressed in the gut [43]. It could be monitored for its binding properties with lipid molecules in the blood and chemical pollutants for designing advanced biosensor chip [44]. Further experimental investigation is required to validate the SlitCSP3 binding modes and substrate specificity through different genetic and molecular techniques.