Targeting Toxoplasma gondii ME49 TgAPN2: A Bioinformatics Approach for Antiparasitic Drug Discovery

As fewer therapeutic options are available for treating toxoplasmosis, newer antiparasitic drugs that can block TgAPN2 M1 aminopeptidase are of significant value. Herein, we employed several computer-aided drug-design approaches with the objective of identifying drug molecules from the Asinex library with stable conformation and binding energy scores. By a structure-based virtual screening process, three molecules—LAS_52160953, LAS_51177972, and LAS_52506311—were identified as promising candidates, with binding affinity scores of −8.6 kcal/mol, −8.5 kcal/mol, and −8.3 kcal/mol, respectively. The compounds produced balanced interacting networks of hydrophilic and hydrophobic interactions, vital for holding the compounds at the docked cavity and stable binding conformation. The docked compound complexes with TgAPN2 were further subjected to molecular dynamic simulations that revealed mean RMSD for the LAS_52160953 complex of 1.45 Å), LAS_51177972 complex 1.02 Å, and LAS_52506311 complex 1.087 Å. Another round of binding free energy validation by MM-GBSA/MM-PBSA was done to confirm docking and simulation findings. The analysis predicted average MM-GBSA value of <−36 kcal/mol and <−35 kcal/mol by MM-PBSA. The compounds were further classified as appropriate candidates to be used as drug-like molecules and showed favorable pharmacokinetics. The shortlisted compounds showed promising biological potency against the TgAPN2 enzyme and may be used in experimental validation. They may also serve as parent structures to design novel derivatives with enhanced biological potency.


Introduction
The M1 aminopeptidases (also termed aminopeptidase N) are found in living systems and are anchored in cell membrane [1]. These enzymes are vital from a functionality perspective and are important in cell growth, development, maintenance, and defense [2]. They are documented as potential candidates for novel drugs and regarded as potential immunoprotected targets [3]. In the Plasmodium falciparum malaria parasite, a single M1 aminopeptidase enzyme has been reported, classified as PfA-M1, that plays a prime role in hemoglobin digestion to ensure pathogen growth and development in the blood stage of the parasite [4]. Targeted inhibition of PfA-M1 has been unveiled to stop P. falciparum inhibition in both laboratory and animal murine models [5]. Toxoplasma gondii is a causative agent of toxoplasmosis, which can cause serious complications in pregnant women [2,[6][7][8][9]. The congenital pathogen infection results in mental retardation, miscarriage, and hearing and vision problems [6]. Toxoplasmosis management relies on chemotherapy comprising pyrimethamine with either clindamycin or sulfadiazine [10]. Drugs that can treat both active and latent infections are scarce, and thus new biomolecular targets and their inhibitors are needed for drug discovery [11].
The M1 aminopeptidases are of three types in the T. gondii ME49 strain-TgAPN1, TgAPN2, and TgAPN3-and encoded by genes present on different chromosomes [2]. TgAPN1 is experimentally active and shown as immunogenic. TgAPN3 is a metalloexopeptidase enzyme and a functional peptide, despite the N-terminal transmembrane anchor. TgAPN2 has localization within the parasite cytosol and is expressed at all stages of the parasite life cycle. The biological function of TgAPN2 is still not clear; however, it is predicted to play a role in tachyzoite growth. Its upregulation in bradyzoites also makes it an important target for drug development [2].
In contrast to conventional drug discovery, computer-aided drug design (CADD) is considered attractive, as it is more cost-effective and consumes less time and little human resources [12][13][14]. CADD is a multidisciplinary approach and involves structure-based and ligand-based virtual screening of drug libraries against given biological targets [15,16]. The aim of the study was to identify the best binding molecules of T. gondii ME49 TgAPN2 that show stable binding mode associated with lowest binding energy score. The aim was split into several objectives. First, screening of the Asinex drug library was done by using blind screening to identify novel binding sites for inhibitors [17]. The docking analysis includes both identification of compound binding mode as well as binding affinity score [18,19]. As the docking and virtual screening calculations are associated with high false-positive rates, the study was extended by molecular dynamic simulation analysis, which aids in validating hit compound binding modes and interaction stability along the length of simulation time [20]. Molecular dynamic simulation also helps in pointing to protein regions that are more flexible and stable in the presence of compounds and residues that are vital in compound binding and biomolecule catalytic mechanisms. Dynamic simulation analysis is key in determining the binding and chemical interactions of lead molecules to the receptors and deciphering conformation stability during the simulation [21][22][23][24][25][26][27][28][29][30][31][32]. Binding free energy analysis was conducted to revalidate the docking and simulation findings [20,21]. Binding free energy methods rely on simulation trajectories and play a key role in compound structure optimization for lead discovery [25][26][27][28][29][30][31][32]. These methods have gained considerable appreciation in recent years, as they have proved vital in estimation of relative binding free energies [33,34]. Compared to molecular docking-or virtual screening-based approaches, these methods are more significant in predicting compound binding [35][36][37].The outcomes of this study may hasten antiparasitic drug development for controlling T. gondii ME49 diseases. Furthermore, new chemical structures may be designed from the compound's parent structure in order to achieve good biological potency and a safer pharmacokinetic profile.

Molecular Dynamic Simulations
Molecular dynamic simulations were conducted to investigate the physical movements of TgAPN2 in the presence of lead molecules. This helped in understanding the long-term compounds conformational stability with the enzyme and shed light on key molecular features of both receptor and compounds vital in intermolecular interactions. The analysis includes root mean square deviation (RMSD) [38,39], root mean square fluctuation (RMSF) [31,39] and radius of gyration (RoG) [4,5] (Figure 2). The analyses were done by mean of carbon alpha atoms. The RMSD describes the structure variation by measuring distance among the superimposed docked intermolecular conformation over  087 Å). Among the systems, the first had more structural variations than the last two; however, it was still in a very stable range. All these values determine the complexes as highly stable from the docked conformation perspective. It further emphasizes that the compounds' docked conformation with the enzyme is solid and no freedom of energy is available in the systems to detach the ligands. The control compound 1 RMSD is given in Figure S2, which shows mean RMSD of 2.4 Å. The reflects that the control docked complex with the enzyme showed more structure deviations than the filtered complexes. Previous studies reported that RMSD is important to determine structural stability of compounds at the receptor active pocket [39][40][41][42]. This was further validated by RMSF assay, which determines residue-based stability. RMSF analysis is important to underline the key regions of enzymes that play a vital role in compound recognition and long-term binding stability. RMSF, as can be seen in Figure 2B, evaluated that most of the receptor residues lie within the stable range in the presence of compounds. However, the N-terminal and C-terminal of the enzyme are a bit more flexible due to loops that are naturally flexible. The flexibility provides structure adaption to the enzyme during the catalytic process, thereby creating enough room for compounds to attain stable energy. Next, the compact and relaxed 3D conformation of TgAPN2 was investigated using RoG [32,43,44]. This analysis is important to understand the compactness and relaxed nature of the enzyme in the presence of compounds during simulation. Higher RoG determines loose packing of the enzyme's secondary structure elements which may results in ligand detachment. The RoG depicted that the systems are compact with no major variations seen. This analysis complements the RMSD in gaining confidence in the system's intermolecular docked conformation. The RoG value of the system ranged between 43 Å and 45 Å. These values are within the stable range and depict the stable compound binding with the enzyme and higher complex intermolecular stability.

Molecular Dynamic Simulations
Molecular dynamic simulations were conducted to investigate the physical movements of TgAPN2 in the presence of lead molecules. This helped in understanding the long-term compounds conformational stability with the enzyme and shed light on key molecular features of both receptor and compounds vital in intermolecular interactions. The analysis includes root mean square deviation (RMSD) [38,39], root mean square fluc- in ligand detachment. The RoG depicted that the systems are compact w iations seen. This analysis complements the RMSD in gaining confidenc intermolecular docked conformation. The RoG value of the system rang and 45 Å. These values are within the stable range and depict the stable ing with the enzyme and higher complex intermolecular stability.

Binding Free Energy Calculations
The binding free energy calculation by MM-GBSA/MM-PBSA was der to reassess the docked affinity of compounds for Toxoplasma gondi The details of binding free energies are given in Table 2. All the complexe

Binding Free Energy Calculations
The binding free energy calculation by MM-GBSA/MM-PBSA was carried out in order to reassess the docked affinity of compounds for Toxoplasma gondii ME49 TgAPN2. The  Table 2. All the complexes revealed stable energies, with an average value <−36 kcal/mol. The overall energy was dominated by van der Waals forces, which illustrated the complexes were stabilized mainly by weak hydrophobic energy. The TgAPN2-LAS_52160953 complex, LAS_51177972 complex and LAS_52506311 complex had net MM-GBSA binding energy scores of −39.45 kcal/mol, −36.19 kcal/mol, and −39.4 kcal/mol, respectively. High contributions were seen from van der Waals energy: −36.89 kcal/mol for LAS_52160953 complex, −33.60 kcal/mol for LAS_51177972 complex, and −36.02 kcal/mol for LAS_52506311 complex. The electrostatic energy contribution, though, was less contributing than the van der Waals energy, but still played a favorable role in complex stabilization. There were high contributions for LAS_52160953 and LAS_52506311: −14.23 kcal/mol and −14.85 kcal/mol, respectively. The net gas phase energy involved insignificant energy contribution from internal energy. The solvation energy contribution for each complex in MM-GBSA was seen as unfavorable. Similarly, on MM-PBSA analysis, the complexes also revealed stable energies. The net energy MM-PBSA values for TgAPN2-LAS_52160953 complex, LAS_51177972 complex and LAS_52506311 complex were −38.57 kcal/mol, −35.23 kcal/mol and −37.12 kcal/mol, respectively. These energy values were more stable for complexes than reported by MM-GBSA. Binding entropy values of LAS_52160953, LAS_51177972 and LAS_52506311 were 16.14 kcal/mol, 17.31 kcal/mol and 16.78 kcal/mol, respectively.

WaterSwap Analysis
WaterSwap analysis was done to gain more confidence in the intermolecular affinity of complexes. The WaterSwap method is more sophisticated than the MM-GBSA/MM-PBSA method and considers the water molecules' role in bridging ligand to receptor residues. This method has been successfully used in different studies and provided promising results in term of predicting compound binding affinity for the targeted biological macromolecule [27,45]. Three algorithms are used in WaterSwap: thermodynamic integration (TI), free energy perturbation (FEP), and Bennett's. The three algorithms' energy value for each complex is illustrated in Figure 3. All the three systems were confirmed to get maximum stable energy, depicting a strong interacting network.

SwissADME Analysis
In silico prediction of compounds ADMET properties is vital in present drug-discovery pipelines, as it saves cost and time associated with drug failure in clinical trials. The SwissADME details of LAS_52160953, LAS_51177972 and LAS_52506311 are given in Table 3. All the three compounds showed favorable adsorption, distribution, metabolism and excretion [46]. The compounds also revealed good topological polar surface area (TPSA) to allow efficient drug adsorption across the cell membrane [47]. Similarly, the compounds have good gastrointestinal absorption, and thus a high concentration of drugs can reach the site of action. Additionally, the compounds are drug-like, as classified by most rules, specifically Lipinski's rule of five [48], Veber [49], and Egan [50]. Lipinski's rule of five is a famous drug rule and describes that a biologically active drug molecule should possess several physical and chemical properties to be orally active. These properties include suitable molecular weight (<500 Da), hydrogen bond donors (<5), hydrogen bond acceptors (<10), and partition coefficient less than 5. Additionally, the compounds have no pan-assay interference compounds (PAINS) alerts and thus cannot bind to multiple targets and will specifically bind to a single biomolecule [51]. This prevents cross-reaction of the compounds with different proteins/enzymes when administered to the host and may also reduce unwanted and adverse effects. The compounds are synthetically feasible to synthesize. The easy synthesis of the compounds allows their use in experimental testing to disclose their real biological potency by blocking the biological function of the TgAPN enzyme.

SwissADME Analysis
In silico prediction of compounds ADMET properties is vital in present drug-discovery pipelines, as it saves cost and time associated with drug failure in clinical trials. The SwissADME details of LAS_52160953, LAS_51177972 and LAS_52506311 are given in Table 3. All the three compounds showed favorable adsorption, distribution, metabolism and excretion [46]. The compounds also revealed good topological polar surface area (TPSA) to allow efficient drug adsorption across the cell membrane [47]. Similarly, the compounds have good gastrointestinal absorption, and thus a high concentration of drugs can reach the site of action. Additionally, the compounds are drug-like, as classified by most rules, specifically Lipinski's rule of five [48], Veber [49], and Egan [50]. Lipinski's rule of five is a famous drug rule and describes that a biologically active drug molecule should possess several physical and chemical properties to be orally active. These properties include suitable molecular weight (<500 Da), hydrogen bond donors (<5), hydrogen bond acceptors (<10), and partition coefficient less than 5. Additionally, the compounds have no pan-assay interference compounds (PAINS) alerts and thus cannot bind to multiple targets and will specifically bind to a single biomolecule [51]. This prevents cross-reaction of the compounds with different proteins/enzymes when administered to the host and may also reduce unwanted and adverse effects. The compounds are synthetically feasible to synthesize. The easy synthesis of the compounds allows their use in experimental testing to disclose their real biological potency by blocking the biological function of the TgAPN enzyme.

Preparation of Protein Structure and Asinex Library
The crystallographic structure of T. gondii ME49 TgAPN2 was retrieved from the Protein Data Bank using PDB ID 6OIU [2,52]. The PDB structure was selected due to its recent submission to the database and good resolution quality compared to other available structures. The experimental data snapshot of the structure is as follows: method X-ray diffraction, resolution 2.20 Å, R-value free 0.236, R-value work 0.190, and R-value observed 0.193. The global stoichiometry of the structure is monomer. The native ligands were removed from the crystal structures along with water molecules. The missing amino acids were added using the CHARMM-GUI PDB manipulator [53]. The protonation of the structure was carried out at pH 7.4 using the H++ web server [54]. The structure was energy-minimized using steepest descent and conjugate gradient algorithms in UCSF Chimera v1.16 to eliminate steric clashes [55]. After energy optimization, the structure was saved in the PDB for additional analysis. The drug library used in this study was Asinex (https://www.asinex.com/ accessed on 15 November 2022). This library contains 575,302 compounds (updated on February 2023). It provides a cost-effective, diverse, drug-like chemical space. Most of the compounds in the library have a high degree of drug-likeness. The library was imported to PyRx 0.8 software [56]. The library was then energy-minimized using MM2 force field and converted to pdbqt format to make the compounds ready for virtual screening [57].

Virtual Screening
Virtual screening is a regularly used technique to screen drug libraries against any given enzyme/protein active pocket with the goal of identifying molecules that show best fitting with the receptor enzyme [58]. This technique has been used in different studies and provided lead molecules that produced good biological potency against the receptor [59]. The virtual screening process was initiated by docking all the Asinex library using the PyRx 0.8 AutoDock Vina program. The screening was performed against catalytic domain II by specifying the residues His835, His839 and Glu858 [2].The grid box dimensions were 25 Å on the XYZ axes. The grid box was centered on the X-axis = 90.81 Å, Y-axis = 25.40 Å, and Z-axis = 126.93 Å. The grid box in virtual screening was set as such to provide flexibility to the library compounds for binding the cavity across the T. gondii ME49 TgAPN2, where they show high conformational stability. The number of iterations set for each compound in docking was 100. The docking process was validated by redocking of cocrystalized ligand to the same site reported in the crystal structure. The superimposition of crystalized ligand and docked conformation revealed a root mean square deviation (RMSD) value of 0.27 Å, illustrating docking procedure accuracy [38]. For a positive control, compound 1 from Marijanovic et al., 2020 was used [2].

Molecular Dynamic Simulation
Molecular dynamic simulation is a computer-based approach to study physical movements of atoms/molecules in a fixed period of time [20]. The molecules' movements are solved by Newton's law of motion, and behavior is plotted in the form of a graph [60]. The top complexes (LAS_52160953, LAS_51177972 and LAS_52506311) were subjected to molecular dynamic simulations in explicit water. The starting atomic coordinates used were obtained from molecular docking and processed through the AMBER20 simulation package [61]. Amber ff14sb was employed for M1 aminopeptidase TgAPN2, while generalized Amber force field (GAFF) was applied for nonstandard residues as well as the TIP3P water model [62][63][64][65]. Amber ff14sb was used as it has improved potential for side chain torsion potentials. The compounds' partial charges were added using the AM1-BCC method via antechamber program [66]. Each complex was placed at the center of a dodecahedral box with padding distance of 12 Å, followed by charge neutralization by adding an appropriate number of counterions. The number of sodium ions added were 12, 11 and 13 for LAS_52160953 complex, LAS_51177972 complex and LAS_52506311 complex, respectively. The complexes were next subjected to energy minimization for 5000 steps of the steepest descent algorithm in order to remove irregular geometry and steric clashes. During energy minimization, hydrogen atoms were minimized first for 500 cycles. This was followed by entire systems' atom minimization with restraint of 10 kcal/mol Å 2 . Then, nonheavy atoms were lastly energy-minimized for 500 rounds in the presence of 200 kcal/mol Å 2 on the remaining system. Afterward, equilibration was done for 100 ps under canonical ensemble. Then, another equilibration was performed for each system under isothermal isobaric ensemble in the presence of 1 bar pressure. Heating of each system was done gradually and achieved a constant temperature of 310 K. The production run was performed for 200 ns in periodic boundary conditions. The production run was facilitated in NVT (constant temperature, constant volume) ensemble and in the presence of Berendsen algorithm. The SHAKE algorithm and Langevin were considered for maintain restraint on hydrogen bonds and constant temperature, respectively [67,68]. The particle-mesh Ewald method was employed for treating long-range interactions. The cutoff value used for long-range interactions was 1.0 nm. The CPPTRAJ module of AMBER was used for structure stability analysis of systems [69]. Plots were generated using XMGRACE v 5.1 [70] and snapshots of simulation were visualized using VMD v 1.93 [71].

MM/PBSA Binding Free Energy Calculations
The docking programs offer simplified scoring functions for predicting ligand binding affinity for receptors. Therefore, they suffer from several limitations in accurate prediction of binding energy. Validation of the binding affinity of studied compounds was accomplished using molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics generalized Born surface area (MM/GBSA) methods as implemented in AMBER MMPBSA.py [29,30,[72][73][74]. In the process, 1000 snapshots were extracted from the simulation trajectories. The MM/PBSA net binding energy was determined using the following equation: During the analysis, the nonpolar energies were estimated using linear combination of pairwise overlaps with water probe radius of 1.4 Å. The β and γ values of nonpolar solvation energy employed were 0 kcal/mol in MMGBSA and 0.092 kcal/mol in MMPBSA. The dielectric constant value in MM-GBSA was set to 1.0. The entropy energy contribution in complex formation was determined using a bash script described by Duan et al., 2016 [75]. The MM-GBSA and MM-PBSA binding energies were reconfirmed by WaterSwap, which is considered more sophisticated in terms of determining water molecules' contribution in ligand binding to the receptor [76,77].

Conclusions
In this work, three high-affinity binders were identified-LAS_52160953, LAS_51177972, and LAS_52506311-against TgAPN2. The molecules achieved stable binding at the enzyme active pocket and produced a stable interacting network of hydrophilic and hydrophobic contacts. The docked complexes also demonstrated stable dynamics with no major structure variations. The compound binding of compounds was confirmed by different binding free energy methods. The binding energy of intermolecular complexes was dominated by van der Waals energy. Additionally, the electrostatic energy had a vital contribution in making the compounds' stable binding mode. Though results of the study are promising, revalidation of biological potency can be ensured by subjecting the compounds to in vitro and in vivo studies. Further, the lead compounds' structures might be used in designing novel promising derivatives that have fewer structure and physicochemical limitations.

Funding:
This study was supported via funding from Prince Sattam Bin Abdulaziz University project number (PSAU/2023/R/1444).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available within the article.