In Silico Analysis of Homologous Heterodimers of Cruzipain-Chagasin from Structural Models Built by Homology

The present study gives an overview of the binding energetics of the homologous heterodimers of cruzipain−chagasin based on the binding energy (ΔGb) prediction obtained with FoldX. This analysis involves a total of 70 homologous models of the cruzipain−chagasin complex which were constructed by homology from the combinatory variation of nine papain-like cysteine peptidase structures and seven cysteine protease inhibitor structures (as chagasin-like and cystatin-like inhibitors). Only 32 systems have been evaluated experimentally, ΔGbexperimental values previously reported. Therefore, the result of the multiple analysis in terms of the thermodynamic parameters, are shown as relative energy |ΔΔG| = |ΔGbfrom FoldX − ΔGbexperimental|. Nine models were identified that recorded |ΔΔG| < 1.3, five models to 2.8 > |ΔΔG| > 1.3 and the other 18 models, values of |ΔΔG| > 2.8. The energetic analysis of the contribution of ΔH and ΔS to ΔGb to the 14-molecular model presents a ΔGb mostly ΔH-driven at neutral pH and at an ionic strength (I) of 0.15 M. The dependence of ΔGb(I,pH) at 298 K to the cruzipain−chagasin complex predicts a linear dependence of ΔGb(I). The computational protocol allowed the identification and prediction of thermodynamics binding energy parameters for cruzipain−chagasin-like heterodimers.


Introduction
The protozoan Trypanosoma cruzi (T. cruzi) is known as the etiologic agent of Chagas disease (CD) which is also termed American trypanosomiasis [1,2]. In endemic areas, the main vector of transmission of CD that has its natural ecosystem on the American continent is triatomine bugs (of the genus Triatoma). Other forms of CD transmission are congenital, by blood transfusion, by organ transplantation, orally (by food contaminated with parasites) or by laboratory accident [3,4]. With the purpose of treating CD, the scientific community has directed its efforts mainly in three lines of research: the search and design of new drugs, the design of biomarkers to test the effectiveness of drugs and techniques to diagnose the disease. To fulfil this purpose, a series of pharmacological in Table 1. These systems have a high sequence identity as well as a high structural similarity and share a common inhibition mechanism [19] and a K i with a magnitude in the order of nM or less (Table 1). Therefore, their high specificity and the structural study of their binding make them attractive potentials for the natural design of inhibitors of the peptide structure of proteases, as a protein or as peptide inhibitors for use as drugs in the treatment of diseases caused by pathogenic protozoa.  Turk et al. [24]; b Redzynia et al. [25]; c Wang et al. [26]; d Ljunggren et al. [27]; e Rowan et al. [28]; f Chagas et al. [29]; g Gerhartz et al. [30]; h Lima et al. [31]; i Dos Reis et al. [32]; j Smith et al. [11]; k Stoka et al. [33].
In order to design new CPI against Chagas disease, we studied homologous heterodimers of cruzipain−chagasin; both proteins occur in T. cruzi protozoa. This analysis involved a total of 70 homologous heterodimers that were constructed from the combination of nine CP structures and seven natural CPI structures. The CP structures occur in protozoan pathogens (T. cruzi: cruzipain; T. congolense: congopain; P. falciparum: falcipain 2; and L. mexicana: LMCP and LMCP B) as well as in human (cath_B, cath_H, cath_L and cath_V) and plant (papain) organisms. The inhibitor structures involve natural CPI of protozoan pathogens (T. cruzi: chagasin; and L. mexicana: LEIME) as well as cystatin of human (cyst_A, cyst_B and cyst_C), coturnix (Cj_cyst) and gallus (Gg_cyst) organisms. We analysed papain-like structures such as cystein-protease (CP) biding to the natural inhibitor cystein-protease (CPI) and performed an in silico analysis of their dependence of ∆G b with the variation of pH, temperature and ionic strength.

Results
There is little published information on K i values determined by spectroscopic methods or other instrumental analysis methods of homologs of the cuzipain−chagasin complex (Table 1). These originate from different sources and time periods, have been normalized and can be truly compared. Only 32 systems have been evaluated experimentally, ∆G b experimental values previously reported.
However, using the information obtained from ∆G b prediction of the systems under study and the knowledge of the structural levels (mainly residue sequence information and crystallographic structures of free and complexed protein) of the homologous heterodimers of cuzipain−chagasin, it was feasible to carry out the present study. In order to obtain the optimal visual appreciation of K i values obtained in in vitro experiments previously reported, we present in Figure 1 the (K i ) −1 profile of homologs of the cuzipain−chagasin complex.  Table 1).

Heterodimer Models Constructed by Homology Modeling
Seventy molecular models, homologous cruzipain-chagasin heterodimers (e.g., Figure 2A), were constructed by homology modelling using the SWISS-MODEL server [34]. The details of the templates used are presented in Table 2. Two model quality evaluation parameters are showed. The first score refers to the global model quality estimation (QMQE) from the model building which is presented in a SWISS-MODEL homology modelling report [34] (e.g., Figure 2D) and the second, the protein-protein complex evaluation that is calculated by the PROCOS program [35]. All constructed models that registered a QMQE score > 0.72 or a PROCOS score > 0.75 were optimized. These models were refined with the GalaxyRefineComplex program by interface repacking, which successfully improved homology model structures. This refinement method allows flexibility at the protein interface and in the overall structure to capture conformational changes that occur upon binding [36]. ΔGb was calculated by FoldX, which uses the empirical force field FoldX [22]. Previously, to evaluate energy, the Repair-FoldX tool was used to avoid the presence of incompatibilities between the CP and CPI structure.  Table 1).

Heterodimer Models Constructed by Homology Modeling
Seventy molecular models, homologous cruzipain-chagasin heterodimers (e.g., Figure 2A), were constructed by homology modelling using the SWISS-MODEL server [34]. The details of the templates used are presented in Table 2. Two model quality evaluation parameters are showed. The first score refers to the global model quality estimation (QMQE) from the model building which is presented in a SWISS-MODEL homology modelling report [34] (e.g., Figure 2D) and the second, the protein-protein complex evaluation that is calculated by the PROCOS program [35]. All constructed models that registered a QMQE score > 0.72 or a PROCOS score > 0.75 were optimized. These models were refined with the GalaxyRefineComplex program by interface repacking, which successfully improved homology model structures. This refinement method allows flexibility at the protein interface and in the overall structure to capture conformational changes that occur upon binding [36]. ∆G b was calculated by FoldX, which uses the empirical force field FoldX [22]. Previously, to evaluate energy, the Repair-FoldX tool was used to avoid the presence of incompatibilities between the CP and CPI structure.
Validation of the computational methodology was carried out by evaluating the RMSD parameter of the nine molecular models obtained by homology, using as a reference their respective crystal complex heterodimer structures obtained from the Protein Data Bank as well as the PROCOS score [35]. This comparison of results is showed in Table 3. A more detailed analysis of the QMQE score and the PROCOS score of the 70 molecular models constructed by homology is presented in Table 2. We also performed an analysis of the composition of the residues in the interface of the PDB structures. The corresponding sequence alignment and the conserved residues identified in the interface of homologous heterodimers of cuzipain−chagasin are presented in Appendix A.

∆G b Prediction of Homologs of the Cruzipain−Chagasin Complex
The predicted ∆G b values obtained from the 70 models built by homology for the different systems studied are showed in a range of −21.5 to −9.5 Kcal/mol. In terms of affinity as K a = (K d ) −1 (see Materials and Methods), these results recorded a moderate affinity (10 7 to 10 12 M −1 ) and a high affinity (10 12 to 10 15 M −1 ). The (K d ) −1 data profile is shown in Figure 3.

Energetic Analysis
In order to carry out a multiple analysis of the 70 models of homologous heterodimers of cuzipain−chagasin, we calculated ΔΔG and considered the presence of two ligands that bind to the same receptor. Therefore, the multiple analysis in terms of thermodynamic parameters was from FoldX from Ki

Energetic Analysis
In order to carry out a multiple analysis of the 70 models of homologous heterodimers of cuzipain−chagasin, we calculated ∆∆G and considered the presence of two ligands that bind to the same receptor. Therefore, the multiple analysis in terms of thermodynamic parameters was performed as |∆∆G| = |∆G b from FoldX − ∆G b from Ki | in absolute value (see Section 4). In this analysis, nine models were identified that recorded |∆∆G| < 1.3 (dimer systems as cruzipain in complex with chagasin, cyst_A, cyst_C and Gg_cyst and also cath_H−LEIME, cath_H−cyst_A, cath_L−cyst_A, cath_L−cyst_B and cath_B−cyst_C) and five models to 2.8 > |∆∆G| > 1.3 (systems as cruzipain−cyst_B, papain−cyst_C, LMCP−LEIME, papain−Cj_cyst and papain−Gg_cyst) (see Figure 4). The other twenty models for comparison showed a tendency or profile according to the behaviour between the systems analyzed but not in magnitude (|∆∆G| > 2.8).

Energetic Analysis
In order to carry out a multiple analysis of the 70 models of homologous heterodimers of cuzipain−chagasin, we calculated ΔΔG and considered the presence of two ligands that bind to the same receptor. Therefore, the multiple analysis in terms of thermodynamic parameters was performed as |ΔΔG| = |ΔGb from FoldX − ΔGb from Ki | in absolute value (see Section 4). In this analysis, nine models were identified that recorded |ΔΔG| < 1.3 (dimer systems as cruzipain in complex with chagasin, cyst_A, cyst_C and Gg_cyst and also cath_H−LEIME, cath_H−cyst_A, cath_L−cyst_A, cath_L−cyst_B and cath_B−cyst_C) and five models to 2.8 > |ΔΔG| > 1.3 (systems as cruzipain−cyst_B, papain−cyst_C, LMCP−LEIME, papain−Cj_cyst and papain−Gg_cyst) (see Figure  4). The other twenty models for comparison showed a tendency or profile according to the behaviour between the systems analyzed but not in magnitude (|ΔΔG| > 2.8).  (Table 1). A Graph representation of competitive inhibition of cruzipain with chagasin inhibitor homologs (|ΔΔG i | = |ΔGb from FoldX − ΔGb experimental |). B Graphical record of energy ΔG. The Gibbs equation is defined as ∆G b = ∆H − T∆S and involves two components that contribute energy to the equilibrium, enthalpy (∆H) and entropy (∆S). The predicted ∆G b was obtained at the temperatures of 298, 313 and 323 K. Then, a van't Hoff plots analysis was performed to obtain the thermodynamic terms (∆H and ∆S). Thus, the ∆G b expressed in terms of the contribution of ∆H and ∆S is showed in Figure 5. The magnitude of the contribution of ∆H and ∆S to ∆G are presented in Appendix B.
Finally, we present the results of a multiple analysis of the 14 models of homologous heterodimers of cuzipain−chagasin and their ∆G b (I, pH) profiles based on ∆∆G b (relative binding energy). These studies are present in Appendix B. The typical result of this analysis for the particular case of cruzipain in interaction with the seven inhibitors is shown in Figure 6. energy to the equilibrium, enthalpy (ΔH) and entropy (ΔS). The predicted ΔGb was obtained at the temperatures of 298, 313 and 323 K. Then, a van`t Hoff plots analysis was performed to obtain the thermodynamic terms (ΔH and ΔS). Thus, the ΔGb expressed in terms of the contribution of ΔH and ΔS is showed in Figure 5. The magnitude of the contribution of ΔH and ΔS to ΔG are presented in Appendix B.

Analysis of Ki Values Obtained in In Vitro Experiments (Reported in The Literature)
Homologous cruzipain-chagasin complexes previously reported, this shows Ki records in the range of nM to fM ( Table 1). The cruzipain-chagasin and cruzipain-cyst_B complexes recorded a Ki

Analysis of Ki Values Obtained in In Vitro Experiments (Reported in The Literature)
Homologous cruzipain-chagasin complexes previously reported, this shows Ki records in the range of nM to fM ( Table 1). The cruzipain-chagasin and cruzipain-cyst_B complexes recorded a Ki

Analysis of K i Values Obtained in In Vitro Experiments (Reported in the Literature)
Homologous cruzipain-chagasin complexes previously reported, this shows K i records in the range of nM to fM ( Table 1). The cruzipain-chagasin and cruzipain-cyst_B complexes recorded a K i of around 0.045 nM. In reference to the cruzipain-cyst_A, cruzipain-cyst_C cruzipain-Gg_cyst complexes have an order of magnitude in K i less than that registered in the cruzipain-chagasin complex. It highlights a particular interest other systems such as cath_L-cyst_C, papain-cyst_C and papain-Gg_cyst complex due to its low registered K i .

Molecular Models Constructed by Homology Modeling and Methodology Validation
The methodology implemented in the development of this work was validated with satisfactory results. The PROCOS results showed a probability-like measure to be a native composition interface for the 70 molecular models analysed. The PROCOS scores collected from the nine PDB structure analysis were values of 0.47 to 1.0 (66.7% of values between 0.8 and 1.0) and 0.59 to 0.99 for the respective molecular models constructed by homology (Table 2). Values with the lowest probability may be an implication that shows very high van der Waals energies [36] such as those of the 3CBK, 3KSE and 3KFQ complex. The global analysis of the molecular model constructed by homology showed a PROCOS score of 75% for values of 0.75 to 1.0. We considered, according to the validation results, that in low scores there is a presence of strong nonspecific interactions due to the presence of aromatic residues at the interface; only 33.3% of models constructed were refined with the GalaxyRefineComplex program (see Table 2). An extensive analysis of the interface of cysteine protease−cystatin structures, such as crystal structures (PDB ID: 1STF, 3K9M, 1NB5, 3KSE and 3KFQ) and two molecular models (chymo−Gg_cyst and papain−Gg_cyst complex) has been presented previously [38].

Analysis of ∆G b Prediction Values
The analysis of ∆G b from cruzipain-CPIs complex (Figure 7) reveals that CPI proteins from protozoan pathogens both cruzipain-chagasin and cruzipain-LEIME complex (−14.5 and −16.3 Kcal/mol, respectively) register a ∆G b magnitude close to cruzipain-cyst_A and cruzipain-cyst_C (both CPIs of human) of −15.4 Kcal/mol. While the cruzipain-cyst_B complex recorded a higher ∆G b value (−12.2 Kcal/mol) compared to that obtained for cruzipain-chagasin complex. Whereas the inhibitors Cj_cyst and Gg_cyst register a considerably lower ∆G b (−18.1 and −17.7 Kcal/mol, respectively) than cruzipain-chagasin complex. Other important correlations in the CP-CPI complex study may emerge from a more detailed analysis of the predictions presented in Figure 7. Of our particular interest is the prediction of a high affinity of chagasin for cathepsins human -cath_B, cath_H, cath_L and cath_V-(∆G b of −15.6 to −20.0 Kcal/mol) as well as bird cystatins (Cj_cyst and Gg_cyst) for cathepsins human and CP from protozoan pathogens (e.g., cruzipain, congopain, falcipain 2, LMCP and LMCP B). Plant CPs such as papain-chagasin complex showed interesting lower ∆G b magnitudes (−17.3 Kcal/mol) compared to cruzipain-chagasin complex.

Molecular Models Constructed by Homology Modeling and Methodology Validation
The methodology implemented in the development of this work was validated with satisfactory results. The PROCOS results showed a probability-like measure to be a native composition interface for the 70 molecular models analysed. The PROCOS scores collected from the nine PDB structure analysis were values of 0.47 to 1.0 (66.7% of values between 0.8 and 1.0) and 0.59 to 0.99 for the respective molecular models constructed by homology (Table 2). Values with the lowest probability may be an implication that shows very high van der Waals energies [36] such as those of the 3CBK, 3KSE and 3KFQ complex. The global analysis of the molecular model constructed by homology showed a PROCOS score of 75% for values of 0.75 to 1.0. We considered, according to the validation results, that in low scores there is a presence of strong nonspecific interactions due to the presence of aromatic residues at the interface; only 33.3% of models constructed were refined with the GalaxyRefineComplex program (see Table 2). An extensive analysis of the interface of cysteine protease−cystatin structures, such as crystal structures (PDB ID: 1STF, 3K9M, 1NB5, 3KSE and 3KFQ) and two molecular models (chymo−Gg_cyst and papain−Gg_cyst complex) has been presented previously [38].

Analysis of ΔGb Prediction Values
The analysis of ΔGb from cruzipain-CPIs complex (Figure 7) reveals that CPI proteins from protozoan pathogens both cruzipain-chagasin and cruzipain-LEIME complex (−14.  The results presented in the FoldX predictions (Figure 3) with the experimental data ( Figure 1) are not comparable; conceptually K i is different from K d . Therefore, the presented magnitudes of K i as experimental data do not need to be of the same order of magnitude as K d obtained by the predictions of FoldX. In a reversible bimolecular mechanism of a simple one-step model, the K d and the equilibrium association constant (K a ) are reciprocally related (K a = K d −1 ) [38]. On the other hand, the inhibition constant (K i ) can be determined or approximated (K i app ) with various enzymatic kinetics from data obtained in an enzyme inhibition experiment (for example, methods that use a nonlinear expression more accurately estimate K i as well as values for K M and V MAX or the linearized approach of Dixon [39]). In ligand binding studies and under controlled conditions, K i can be calculated with the Cheng and Proof equation, where it is assumed that K i ≈ K d = (K a −1 ) [40].
Performing a global analysis of results in terms of ∆G b , there is no clear trend in the behaviour of the energy magnitudes obtained. This variability of magnitudes recorded for ∆G b in the structural homologs of the cruzipain−chagasin complex is inherent in the composition of the residues that constitute their respective interfaces. The energy analysis of the contribution of ∆H and ∆S to ∆G b of the 70-molecular model of homologs of the cruzipain−chagasin complex presents a ∆G b is mostly ∆H-driven ( Figure 5 Papain-like CP play an important role in the regulation of biological cycles of all organisms. Endogenous protein structure inhibitors such as chagasin-like and cystatin-like regulate its biological function. We can appreciate in Figure 1 that cysteine proteases of the protozoan pathogen have greater specificity and a high affinity for their human endogen inhibitors (cystatin-like) compared to their natural inhibitor (chagasin). Regarding the natural chagasin-like inhibitors such as chagasin and LEIME, the predicted ∆G b reveals a high affinity for human cathepsins much better than human cystatins (Figure 7).
Considering the results analyzed, it is possible to formulate two hypothetical events that could occur simultaneously in the invasive process of the pathogenic protozoan ( Figure 8). The first favours the metabolism of the pathogenic protozoan where endogenous CPI of the human ( CPI_hum ) is not able to inhibit the CP of the protozoan pathogen ( CP_pp ); that is, ∆G b CP_pp-CPI_pp << ∆G b CP_pp-CPI_hum . The second event occurs in favour of the parasite defense mechanism due to the presence of endogenous CPI inhibitors of pathogenic protozoan ( CPI_pp ) with a high affinity that can inactivate human CP ( CP_hum ); that is ∆G b CP_hum-CPI_pp << ∆G b CP_hum-CPI_hum . At this moment, the putative situation that we have cited is of great interest to the scientific community and innovative research is being developed in this respect to continue discovering and characterizing new natural inhibitors of this protein structure [7].
predictions of FoldX. In a reversible bimolecular mechanism of a simple one-step model, the Kd and the equilibrium association constant (Ka) are reciprocally related (Ka = Kd −1 ) [38]. On the other hand, the inhibition constant (Ki) can be determined or approximated (Ki app ) with various enzymatic kinetics from data obtained in an enzyme inhibition experiment (for example, methods that use a nonlinear expression more accurately estimate Ki as well as values for KM and VMAX or the linearized approach of Dixon [39]). In ligand binding studies and under controlled conditions, Ki can be calculated with the Cheng and Proof equation, where it is assumed that Ki ≈ Kd = (Ka −1 ) [40].
Performing a global analysis of results in terms of ΔGb, there is no clear trend in the behaviour of the energy magnitudes obtained. This variability of magnitudes recorded for ΔGb in the structural homologs of the cruzipain−chagasin complex is inherent in the composition of the residues that constitute their respective interfaces. The energy analysis of the contribution of ΔH and ΔS to ΔGb of the 70-molecular model of homologs of the cruzipain−chagasin complex presents a ΔGb is mostly ΔH-driven ( Figure 5)  Papain-like CP play an important role in the regulation of biological cycles of all organisms. Endogenous protein structure inhibitors such as chagasin-like and cystatin-like regulate its biological function. We can appreciate in Figure 1 that cysteine proteases of the protozoan pathogen have greater specificity and a high affinity for their human endogen inhibitors (cystatin-like) compared to their natural inhibitor (chagasin). Regarding the natural chagasin-like inhibitors such as chagasin and LEIME, the predicted ΔGb reveals a high affinity for human cathepsins much better than human cystatins (Figure 7).
Considering the results analyzed, it is possible to formulate two hypothetical events that could occur simultaneously in the invasive process of the pathogenic protozoan ( Figure 8). The first favours the metabolism of the pathogenic protozoan where endogenous CPI of the human ( CPI_hum ) is not able to inhibit the CP of the protozoan pathogen ( CP_pp ); that is, ΔGb CP_pp-CPI_pp << ΔGb CP_pp-CPI_hum . The second event occurs in favour of the parasite defense mechanism due to the presence of endogenous CPI inhibitors of pathogenic protozoan ( CPI_pp ) with a high affinity that can inactivate human CP ( CP_hum ); that is ΔGb CP_hum-CPI_pp << ΔGb CP_hum-CPI_hum . At this moment, the putative situation that we have cited is of great interest to the scientific community and innovative research is being developed in this respect to continue discovering and characterizing new natural inhibitors of this protein structure [7]. The agonist interactions between CP and CPI allow correct metabolic function of the living organism. In the infective process of a human organism by a pathogenic protozoan, the presence of both CP and exogenous PCI (from the protozoan pathogen) is a determinant that causes the disease because the defense mechanism of the human organism is affected or interrupted. A potential solution to these antagonistic interactions that could be applied in the pharmacological treatment of the disease caused by pathogenic protozoa is the exogenous insertion of natural plant CPI or CP [7].
In agreement with the previously reported information about the experimental K i heterodimer homologs of cruzipain−chagasin (Table 1) and the respective analysis presented in this work, we suggest that the presence of highly specific cross-linkers and chagasin for homologous agonists (cathepsins and cystatins) in humans is a key point of the evolutionary persistence of Chagas disease. The development of new CPIs with a protein structure and peptide inhibitors derived from these protein structures is based on the study of the interaction of natural PCI such as chagasin-like and cystatin-like inhibitors in order to develop new antiparasitic drugs that have non-toxic properties, high specificity, biological selectivity and potent and reversible inhibition.

Materials and Methods
The protocol used to study the 70 homologous heterodimers of cruzipain−chagasin was divided into three stages (the methodology is presented in Figure 9): (i) Identification templates in RCSB PDB and alignment of sequences obtained from UniProtKB; (ii) Model building with the SWISS MODEL [34] and refinement of model interfaces with GalaxyRefineComplex [36] and stage (iii) Analysis of binding energy (∆G b ) with the FoldX suite [22].
inhibitor as a like-agonist interaction; (B,C) Enzyme−natural inhibitor as a like-antagonist interaction. Alternative interaction to regulate the biological metabolism in the human organism by an exogenous agent such as an enzyme or natural inhibitor.
The agonist interactions between CP and CPI allow correct metabolic function of the living organism. In the infective process of a human organism by a pathogenic protozoan, the presence of both CP and exogenous PCI (from the protozoan pathogen) is a determinant that causes the disease because the defense mechanism of the human organism is affected or interrupted. A potential solution to these antagonistic interactions that could be applied in the pharmacological treatment of the disease caused by pathogenic protozoa is the exogenous insertion of natural plant CPI or CP [7].
In agreement with the previously reported information about the experimental Ki heterodimer homologs of cruzipain−chagasin (Table 1) and the respective analysis presented in this work, we suggest that the presence of highly specific cross-linkers and chagasin for homologous agonists (cathepsins and cystatins) in humans is a key point of the evolutionary persistence of Chagas disease. The development of new CPIs with a protein structure and peptide inhibitors derived from these protein structures is based on the study of the interaction of natural PCI such as chagasin-like and cystatin-like inhibitors in order to develop new antiparasitic drugs that have non-toxic properties, high specificity, biological selectivity and potent and reversible inhibition.

Materials and Methods
The protocol used to study the 70 homologous heterodimers of cruzipain−chagasin was divided into three stages (the methodology is presented in Figure 9): (i) Identification templates in RCSB PDB and alignment of sequences obtained from UniProtKB; (ii) Model building with the SWISS MODEL [34] and refinement of model interfaces with GalaxyRefineComplex [36] and stage (iii) Analysis of binding energy (ΔGb) with the FoldX suite [22].

Preparation and Analysis of Sequences
Ten crystal complex structures (ID PDB: 1YVB, 2OUL, 3CBK, 3CBK, 3K9M, 2NQD, 3KSE, 3KFQ and 1STF) and five free protein structures (ID PDB: 1ME3, 1YAL, 1PPO, 3GAX and 2C34) were selected from the RCSB PDB (www.rcsb.org (accessed on February 2019)) [37] as templates (Table 4) for the study of homologous heterodimers of the cruzipain−chagasin complex. All structures were refined by subtracting water molecules and external chains. Energy minimization was performed using the YASARA force field [41]. These PDB structures include CP and CPI that occur in organisms such as protozoan pathogens, plants and humans. The respective sequences were recovered with their accession number from UniProt [42]. The sequences were aligned with those reported in the crystal structures and cut to the same length; these protein sequences were used to build the structure models. Table 4. Reference data of papain-like cysteine proteinases and their natural inhibitor in this study.

Preparation and Analysis of Sequences
Ten crystal complex structures (ID PDB: 1YVB, 2OUL, 3CBK, 3CBK, 3K9M, 2NQD, 3KSE, 3KFQ and 1STF) and five free protein structures (ID PDB: 1ME3, 1YAL, 1PPO, 3GAX and 2C34) were selected from the RCSB PDB (www.rcsb.org (accessed on February 2019)) [37] as templates (Table 4) for the study of homologous heterodimers of the cruzipain−chagasin complex. All structures were refined by subtracting water molecules and external chains. Energy minimization was performed using the YASARA force field [41]. These PDB structures include CP and CPI that occur in organisms such as protozoan pathogens, plants and humans. The respective sequences were recovered with their accession number from UniProt [42]. The sequences were aligned with those reported in the crystal structures and cut to the same length; these protein sequences were used to build the structure models. Table 4. Reference data of papain-like cysteine proteinases and their natural inhibitor in this study.
In order to obtain the thermodynamic terms of Gibbs equation (∆G = ∆H − T∆S), enthalpy (∆H) and entropy (∆S), we used the method based on ∆G b prediction at different temperatures combined with van't Hoff plot analyses (lnK = −∆H/(RT) + ∆S/R; K, equilibrium constant and R, ideal gas constant) to obtain the thermodynamic terms of Gibbs equation (∆G = ∆H − T∆S), enthalpy (∆H) and entropy (∆S). Therefore, we proceeded to generate a typical graph 1/T versus lnK, the slope = −∆H/R and intercept = ∆S/R. Solving the linear equation resulting from the linear fitting, we obtained ∆H = (slope) (−R) and ∆S = (intercept) (R).
According to the equation in the equilibrium system, we can define free energy, ∆G b = −RTlnK a = RTlnK d ; where R is a gas constant, T is temperature and K a is a binding constant (K a = 1/K d , where K d is dissociation constant) [12,38,40]. ∆∆G is defined as The term ∆∆G is a thermodynamic parameter that allows us to identify the ∆G b difference present in a system under study (∆G b I ) in relation to a reference system (∆G b II ). It is well known that if it is present, a ten-fold difference in K d s at 298 K corresponds to a ∆∆G difference of 1.34 Kcal/mol. Therefore, a ∆∆G difference of 1 Kcal/mol at 298 K corresponds to a 5.4-fold difference in K d s. In both hypothetical situations the binding of ligand 2 is weaker than that of ligand 1. In the present work, we use ∆∆G to compare the ∆G b recorded in in silico study with data reported previously.

Conclusions
In the present in silico study, we can suggest that from the scaffolding of the interaction of bird cystatins (Cj_cyst and Gg_cyst), it is possible to design or propose natural inhibitors with a high affinity for protozoan proteases. Therefore, it is possible to rationally design both highly selective antagonistic proteins to CP and CPI from protozoan pathogens. The interface composition is decisive to modulate ∆G b . Chagasin-like and cystatin-like structures in interaction with cruzipain homologs showed that ∆G b is mostly ∆H-driven at neutral pH and at an ionic strength (I) of 0.15 M. Because of their wide structural stability to pH, ionic strength (I) and temperature changes (previously reported in in vitro studies by other researchers) and according to the results presented, we propose Cj_cyst and Gg_cyst as excellent candidates and desirable scaffolding for the design of new cruzipain inhibitors derived from peptide structure. The computational protocol based on Homology Modelling and FoldX predictions allowed the identification and prediction of thermodynamics parameters of binding energy for cruzipain−chagasin-like heterodimers.   Table B1 shows the result of the analysis carried out at 273 K, pH 7.0 and I 0.15 M. Table B1. Profile of the contribution of enthalpy (ΔH) y entropy (−TΔS) to ΔG (Kcal/mol) at 273 K, Figure A1. Identification of residues present in the CP to CPI interaction. Query: Cruzipain (PDB 1ME3), chagasin, LEIME (PDB 2C34) and cystatins. *, residues at binding site; **, numbering; i , cruzipain; ii , chagasin; iii , LEIME; iv , cyst_A; v , Gg_cyst. PDB: A , 3E1Z; B , 2OUL; C 2NQD; D , 3K9M; E , 1YVB; F , 3KSE; G , 3KFQ; H , 3CBK; and I , 1STF.

Appendix B
Appendix B.1 Contribution of ∆H y ∆S to ∆G ∆G b (T) were calculated respectively at 273, 313 and 323 K, I 0.15 M and pH 7.0 (see Materials and Methods). Then, Table A1 shows the result of the analysis carried out at 273 K, pH 7.0 and I 0.15 M.                                 Figure A11. papain−CPIs complex.