In Silico Screening for Novel Leucine Aminopeptidase Inhibitors with 3,4-Dihydroisoquinoline Scaffold.

Cancers are the leading cause of deaths worldwide. In 2018, an estimated 18.1 million new cancer cases and 9.6 million cancer-related deaths occurred globally. Several previous studies have shown that the enzyme, leucine aminopeptidase is involved in pathological conditions such as cancer. On the basis of the knowledge that isoquinoline alkaloids have antiproliferative activity and inhibitory activity towards leucine aminopeptidase, the present study was conducted a study which involved database search, virtual screening, and design of new potential leucine aminopeptidase inhibitors with a scaffold based on 3,4-dihydroisoquinoline. These compounds were then filtered through Lipinski’s “rule of five,” and 25 081 of them were then subjected to molecular docking. Next, three-dimensional quantitative structure-activity relationship (3D-QSAR) study was performed for the selected group of compounds with the best binding score results. The developed model, calculated by leave-one-out method, showed acceptable predictive and descriptive capability as represented by standard statistical parameters r2 (0.997) and q2 (0.717). Further, 35 compounds were identified to have an excellent predictive reliability. Finally, nine selected compounds were evaluated for drug-likeness and different pharmacokinetics parameters such as absorption, distribution, metabolism, excretion, and toxicity. Our methodology suggested that compounds with 3,4-dihydroisoquinoline moiety were potentially active in inhibiting leucine aminopeptidase and could be used for further in-depth in vitro and in vivo studies.


Introduction
Cancer incidence and mortality are rapidly increasing, and cancer is presently the most important barrier to increase life expectancy worldwide. According to the World Cancer Report published by the World Health Organization in 2020, cancer is the first or second cause of premature death (i.e., in the age group of 30-69 years) in 134 of 183 countries, and it ranks third or fourth in additional 45 countries [1]. The most commonly diagnosed types of cancer in both genders are lung cancer, female breast cancer, colorectal cancer, prostate cancer, stomach cancer, and liver cancer. Lung cancer remains the leading cause of cancer incidence and mortality globally [2].
Many studies have shown that overexpression of the metalloenzyme leucine aminopeptidase (LAP) plays a role in the proliferation, migration, and invasion of tumor cells and in angiogenesis [3,4]. Leucine aminopeptidase 3 (LAP 3) was reported to be involved in the histological grade, lymph node Our findings regarding the activity of diethyl 6,8-dibenzyloxy-3,4,-dihydroisoquinoline-3,3dicarboxylate were consistent with those of other authors such as Bermejo et al. [15]. These authors discovered isoquinolines with antitumor activity that target the G1 phase of the cell cycle. As G1 phase of the cell cycle is an important period and its malfunction is critical for tumorigenesis and tumor progression, the agents with the ability to arrest cells in the G1 phase can be highly promising new therapeutic agents against human cancers.
Several studies have reported phosphonic acid analogues of leucine as inhibitors of both cytosolic and microsomal aminopeptidases [16,17] and phosphonate and phosphonamidate inhibitors as excellent inhibitors of zinc metallopeptidases. Therefore, we performed theoretical studies, including 3D-QSAR on phosphonic/phosphinic acid-containing isoquinoline derivatives [18]. Our previous research provided essential information on the structural characteristics of the inhibitors and gave new insight into the discovery of LAP inhibitors.

Database Search and Library Establishment
Different compound databases such as ZINC [19], PubChem [20], and DrugBank [21] were searched for dihydroisoquinoline analogues. Then, by using Lipinski's "rule of five" [22,23], molecules with less reasonable physicochemical parameters were discarded, leading to the selection of candidates with good drug-like properties: molecular weight (MW) <500 Da, number of hydrogen bond donors <5, number of hydrogen bond acceptors <10, log 10 partition coefficient (logP) <5, and no more than one violation of the abovementioned criteria. A virtual library of approximately 11 585 compounds extracted from ZINC and 13 011 compounds extracted from PubChem was established. DrugBank database did not show any significant hits.

Ligand Growing Experiment
Independent from database search, the design of LAP inhibitors using the ligand growing experiment was performed. The ligand growing experiment started from a starter ligand, diethyl 6,8-dibenzyloxy-3,4-dihydroisoquinoline-3,3-dicarboxylate, which was then grown against a reference molecule, bestatin, using Spark 10.5.6 [24,25] by mapping a different region of the same active site. The fragment growing experiment with Spark identified viable replacements for the selected portion of a reference compound by using a series of fragment databases. In this experiment, molecular field technology was used, which condensed the molecular fields to a set of points around a molecule, termed as field points.
Each of four starter's substituents, namely R 1 , R 2 , R 3 , and R 4 ( Figure 2), was selected for replacement, and a library containing 500 new derivatives was generated each time. Only one substituent (e.g., R 1 ) was replaced each time, and the other substituents (e.g., R 2 , R 3 , and R 4 ) remained unchanged. Finally, 2000 derivatives were obtained, which were filtered through Lipinski's "rule of five" [22,23]. After filtration, a group of 485 compounds with no violation of Lipinski's "rule of five" was chosen for further studies.

Molecular Docking
Virtual molecular docking is a computer-aided technique used for inexpensive and rapid identification of small compounds that bind to specific targets [26]. Virtual docking involves the docking of large libraries of compounds in the binding site of particular targets, thus potential ligands with potential binding affinity against the target can be selected for biological testing. Because the virtual docking method plays a key role in the identification of new compounds for the inhibition of protein targets, this method was used to identify novel LAP inhibitors. In this study, molecular docking was performed using ICM-Pro 3.8-5 software (Molsoft LLC, San Diego, USA) [27,28]. The filtered compounds from the established virtual library, including 11,585 compounds from ZINC, 13,011 compounds from PubChem, and 485 compounds from Spark, were docked into the binding site of 3D crystal structure of blLAP in complex with l-leucinal (PDB code: 1LAN) [29,30]. The protein structure of blLAP was precisely described previously [16][17][18]29,30]. All the generated binding poses were manually inspected to ensure correct positioning within the binding pocket with respect to the interactions of the ligand moieties with the amino acid residues relevant to the inhibitory activity [31]. Residues such as Lys250, Lys262, Met270, Asn330, Ala333, Asp273, Arg336, Thr359, Leu360, Gly362, Ile421, Ala451, and Met454 play an important role in the interactions of LAP with inhibitors [32]. These residues were used as a filter to discard the incorrect poses derived from the docking. Moreover, the compounds were ranked by a docking score. Scoring function was implemented to predict the biological activity by examining the interactions between the compound and potential target [18].
Docking of compounds from the ZINC library to 1LAN resulted in the selection of top 100 compounds with the best score; after duplicate deletion, only 68 compounds were chosen for further studies. The docking scores for the selected poses of the best compounds in the ZINC library were in the range of −25.18 to −51.09. Docking of compounds from the PubChem library resulted in the selection of 100 compounds with docking scores in the range from −28.07 to −51.72.
All the compounds occupied the binding pocket of LAP and showed hydrogen bonds and hydrophobic interactions, which are typical for the inhibitory activity interactions with amino acid residues and zinc ions (Zn488, Zn489) in the active site ( Figure 3).  Figure 4, is placed in a similar way in the binding pocket of 1LAN as the starter ligand was docked in our previous studies [14,18].

VS with Lead Finder
Subsequently, the top 268 highest ranked compounds (68 compounds from ZINC, 100 compounds from PubChem, 100 compounds from Spark) were subjected to docking using Lead Finder (LF) [33,34] and the predicted binding poses were compared with the relevant results obtained from ICM. The docking software LF has three specialized scoring functions designed to rank the predicted ligand poses, estimate the binding energy of the docked ligand poses, and rank compounds in VS experiments. The LF rank score values were in the range of −4.78 to −18.36 for all studied compounds (including rank score values between −7.71 and −16.39, −9.59 and −18.36, and −4.78 and −15.97 for compounds from ZINC, PubChem, and SPARK, respectively). The results of compounds' docking with LF revealed that some ligands bound to LAP better than that found when docked with ICM. Docking using LF resulted in some cases of improved pose of the ligands in the binding pocket in comparison to ligand pose adopted after ICM docking. The ligands were bind similarly to l-leucinal in l-leucinal-blLAP complex (PDB code: 1LAN) ( Figure 5).

3D-QSAR
First, the 3D molecular structures of the dataset containing the selected 100 compounds from PubChem, 68 compounds from ZINC, and 100 compounds from SPARK were aligned to our previously published 3D-QSAR model for the LAN protein [18], and the compounds were then evaluated, using Forge software (10.6.0, Cresset ® , Litlington, Cambridgeshire, UK) [25,35]. The leave-many-out (LMO) cross-validation was performed, by leaving out 20% of the molecules at each step for the same set of training and test set compounds as used in leave-one-out (LOO) cross-validation. The values of the LOO parameters, cross-validation correlation coefficient (q 2 ), non-cross-validated correlation coefficient (r 2 ), number of components (N), and predicted root mean square error (RMSE pred ) were 0.717, 0.997, 4, and 0.661 respectively. The respective analogues values of the LMO parameters q 2 , r 2 , N, and RMSE pred were 0.680, 0.997, 4, and 0.703, respectively. Similar values of parameters in both LOO and LMO cross-validation methods indicated the stability of the 3D-QSAR model. Statistical validation is a very important process for robust QSAR model development. The value of r 2 should be >0.7 and the value of q 2 should be >0.5 in the QSAR model for high predictive accuracy. This also confirmed that a reliable QSAR model was obtained in the present study.
In the entire library of 264 derivatives (after duplicates removal), the model provided excellent and good description for 35 and 51 compounds, respectively. This implies that most of the features of the evaluated molecules were well described by the training set of the 3D-QSAR model, and the predicted activity could be considered reliable. Other values, described as "bad" or "poor", indicate that the molecule has field points that are not specified by the equation, resulting in unreliable predicted activities. The chemical structures and score values of all the compounds with excellent description by the model are presented in Supplementary Materials Table S1. The compounds with excellent description by the 3D-QSAR were predicted to possess pIC 50 (pIC 50 = −log (IC 50 )) activity between 3.8 and 5.7. A linear regression plot of experimental versus calculated pIC 50 values is presented in Supplementary Materials Figure S4.
From 35 compounds with the excellent description by the 3D-QSAR model, 9 hit compounds were selected, which, in addition had the best ICM docking score values, below −32. The score value below −32 is regarded as a good indicator in compound selection by using ICM software [27,28]. After docking using LF and visual inspection, an improvement in ligand binding to LAP protein was observed but score values were not good indicators for ligand selection in that case. Thus, we decided to select ligands based on ICM docking score and results from 3D-QSAR. The chemical structures, pIC 50 and ICM and LF score values of compounds one to nine are presented in Table 1. Schematic diagram of the underlying workflow of the LAP inhibitors drug discovery is shown in Figure 6.

In Silico Drug-Likeness, Bioavailability, and Toxicology Prediction
Drug-likeness studies measure the probability of a molecule to act as an oral drug in terms of its bioavailability [36,37]. As the interaction of an inhibitor with an enzyme cannot ensure its suitability as a drug, to further strengthen the results of 3D-QSAR and docking studies, we also performed in silico absorption, distribution, metabolism, and excretion (ADME) studies by using the SwissADME web tool [36]. The prediction of these properties is very essential in the drug design process and poor ADMET properties often account for the failure of approximately 60% of new chemical entities in clinical trials. Hence, SwissADME is a very valuable and reliable tool. In silico techniques for the prediction of ADMET properties are more attractive than conventional experimental assays because of the large numbers of compounds (both real and designed) that can be tested and because of significant time and cost reduction.
The oral bioavailability of compounds one to nine is shown in bioavailability radar plots (Figure 7). Compounds three, five, seven, and eight were predicted to be orally bioavailable. The other compounds (one, two, four, six, and nine) showed one off-shoot relative to unsaturation (INSATU), which implies that they could have suboptimal physicochemical properties for their oral bioavailability. Gastrointestinal (GI) absorption and blood-brain barrier (BBB) penetration which are relative to the absorption and distribution parameters are graphically presented in Figure 8. All the compounds were predicted to be passively absorbed by the GI tract. Compounds two to five and seven to nine showed BBB permeation, whereas the other compounds did not. None of the compounds were expected to be effluated from the central nervous system (CNS) by P-glycoprotein.  . BOILED-Egg plot of compounds 1-9 with excellent description of the three-dimensional quantitative structure-activity relationship (3D-QSAR) model and the best ICM score values. Points located in the BOILED-Egg yolk (yellow) represent molecules predicted to passively permeate through the blood-brain barrier (BBB), whereas points in the egg white are predicted to be passively absorbed by the gastrointestinal tract (HIA). Blue dots (PGP+) indicate the molecules expected to be effluated from the central nervous system (CNS) by P-glycoprotein, whereas the red ones (PGP-) indicate the molecules predicted not to be effluated from the CNS by P-glycoprotein.
Metabolism plays an important role in the bioavailability of the drugs and drug-drug interactions. Cytochrome P-450 enzymes (CYPs) constitute the most significant class of enzymes for drug-likeness assessment. These enzymes are the site of the majority of drug-drug interactions. Ten human CYPs from seven subfamilies, namely CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4, and CYP3A5 are responsible for the metabolism of most drugs [38]. Most of the compounds were found to be substrates of CYP1A2, except compounds one, five, six, and nine. Compounds seven and eight were substrates of CYP2C19. Compounds three and eight were substrates of CYP2C9 and CYP2D6. None of the compounds were found to be the substrate of CYP3A4 (Table 2).
Skin permeability coefficient (K p ) was also predicted for the selected compounds. The K p values were in the range of -5.19 to -7.27. The more negative the log K p (with K p in cm/s), the less skin permeant is the molecule [36,37]. The bioavailability score value of 0.56 (56%) for most of the compounds and 0.55 (55%) for molecule nine indicates the probability of their bioavailability, and it is based on total charge of compound, topological polar surface area (TPSA), and violation of Lipinski filter (Table 2).
Pan-assay interference compounds (PAINS) and Brenk filters were implemented to provide information regarding potentially problematic fragments (toxic, metabolically unstable, or possessing properties responsible for poor pharmacokinetics), in the chemical structures of compounds one to nine. Both filters showed alerts for compound six, while Brenk also showed alerts for compounds one and two [36] (Table 2).
Lead-likeness of the studied compounds was also calculated. Leads are molecules, which are subjected to chemical modifications, that will most likely increase size and lipophilicity. Thus, leads should be smaller and less hydrophobic than drug-like molecules [36]. Teague et al. [39] suggested that molecules with MW in the range of 100 to 350 Da and calculated logP in the range of 1 to 3 are greatly superior to those considered drug-like. In our study, compounds one, four, and five were shown to be lead-like (Table 2). In addition to Lipinski s "rule of five" [22,23], another four drug-likeness rules namely Ghose [40], Veber [41], Egan [42], and Muegge [43] were contemporarily satisfied for all the selected compounds.
The results of SwissADME calculations provide useful information about the selected compounds. Considering their GI absorption, metabolism through CYPs, and drug-likeness, all of them could be excellent candidates for further studies and manipulations. Moreover, the calculations results showed that compound one was predicted not only to be not metabolized by CYPs, not permeate through BBB and be passively absorbed by GI tract, but also it had superior properties than other compounds in context to its lead-likeness.
Predicted toxicity of compounds one to nine was calculated in admetSAR online tool [44,45]. The results of toxicity prediction are presented in Table 3. None of the compounds exhibited carcinogenicity and eye corrosion. Compounds three, five, seven, eight, and nine were predicted to not cause eye irritation. All of them, besides compound five, did not cause Ames mutagenesis. Compound five and six were determined to not be hepatotoxic. Most of the compounds belong to class III of acute oral toxicity, which means that according to US EPA classification, their LD 50 values are greater than 500 mg/kg but less than 5000 mg/kg. Compounds two and nine belonged to category II of acute oral toxicity and their LD 50 values are possibly greater than 50 mg/kg but less than 500 mg/kg. Table 3. Toxicity prediction for compounds 1-9. These results provide essential information regarding toxicological profile of compounds one to nine and may be useful in selecting of route of administration and preferred dosage form. However, different values of probability indicated that these studies are preliminary and should be confirmed experimentally.

Database Search and Library Establishment
Different compound databases were searched for the availability of dihydroisoquinoline analogues. For this study, ZINC [19], PubChem [20], and DrugBank [21] were considered. The ZINC database substructure search of 3,4-dihydroisoquinoline reported 12,286 compounds (from 200,000,000 substances). After the removal of duplicates, 12,260 compounds were selected. Then, these compounds were filtered with parameters based on Lipinski's "rule of five" [22,23], and finally a group of 11,585 compounds was obtained. The PubChem database (containing 102,628,457 compounds) provided 21,769 compounds and after consideration of Lipinski's "rule of five", 13,579 compounds were identified. A comparison of the obtained ZINC and PubChem sets of compounds revealed 600 duplicates, which were removed from the PubChem database, and after comparison with the ZINC database, only 13,011 were taken for further study. DrugBank database (including 13 491 compounds) did not reveal any significant hits. Finally, approximately 11,585 compounds from ZINC and 13,011 compounds from PubChem were considered for molecular docking and further in silico studies.

Ligand Growing Experiment
For the ligand growing experiment, Spark 10.5.6 software (Cresset ® , Litlington, Cambridgeshire, UK) was used [24,25]. As a starter molecule, diethyl 6,8-dibenzyloxy-3,4,-dihydroisoquinoline-3,3-dicarboxylate in its bioactive conformation was chosen. Bestatin was loaded as a reference molecule to guide the ligand growth. The starter and reference molecules were pre-aligned in 3D. The score weight was set at 20% for the starter molecule, and at 80% for the reference molecule. A low score weight prevents large movements of the new molecule relative to the starter molecule. We then imported a protein, LAP (PDB code: 1LAN) to use it as an "excluded volume" around the starter molecule. During the Spark search, the excluded volume was checked to assess whether the replacement fragments clash with it. Four regions of the starter molecule were selected to be replaced ( Figure 2). Each portion of the starter molecule was selected to be removed and replaced by the fragment from the databases. Ligand growing calculation method was used. Spark used three databases: ChEMBL_common (from literature reports) with 58,924 fragments and Commercial Very Common and Common (from commercially available compounds) with 20,561 and 42,778 fragments, respectively. Spark then searched these databases for any fragment with the correct number of connection points that geometrically match the broken bonds in the target. Matches were considered on the basis of both angle and distance. Fragments with the correct angles and distances between the connection points were merged into the target molecule to form the "product". The latter was scored against the starter structure for electrostatic and shape similarity.

Molecular Docking
The crystal structure of blLAP complexed with l-leucinal (PDB code: 1LAN) was used for docking performed with ICM-Pro software (Molsoft LLC, San Diego, USA) [27][28][29]. All water molecules and co-crystallized ligands were removed from the PDB structure. Ligands were docked using a regular rigid receptor-flexible ligand docking approach that uses five potential energy maps. The maps were generated in a rectangular box with 0.5 Å grid spacing centered at the ligand binding site. Each molecule was first subjected to a conformational analysis outside of the protein pocket and a stack of low energy conformations is collected and used as starting geometries for grid docking. Ligand binding modes were scored according to the quality of the complex, and a user-defined number of the top scoring poses was re-ranked using the full ICM scoring function. The predicted score was calculated as the weighted sum of ligand-target van der Waals interactions and internal force field energy of the ligand, free energy changes due to conformational energy loss upon ligand binding, hydrogen bonding interactions, hydrogen bond donor-acceptor desolvation energy, solvation electrostatic energy upon ligand binding, hydrophobic free energy gain, and a size correction term proportional to the number of ligand atoms.

VS with Lead Finder
The full-atom model of LAP used in the current study was prepared from the PDB structure by adding hydrogen atoms and assigning ionization states of the amino acids with the Model Builder (build_model) program of the Lead Finder software package v 1.1.13 (BioMolTech ® , Toronto, Ontario, Canada) [33,34]. VS of ligands to the prepared model of 1LAN and binding energy calculations were performed with Lead Finder v. 1.1.13 software using its default configuration parameters. The energy grid box for ligand docking was set at the geometrical center of the reference ligand to span 6 Å in each direction. Lead Finder assumes that the protein and ligand structures are rigid; however, it analyzes possible conformations of ligand structures by rotating functional groups along each freely rotatable bond (FRB). For each ligand pose Lead Finder determines values of the free energy of binding, the VS score, and, if applicable, the pose ranking score by using its three built-in scoring functions.

3D-QSAR
All the optimized structures were imported into Forge software (10.6.0, Cresset ® , Litlington, Cambridgeshire, UK) [35] for the evaluation of the dataset in the field-based 3D-QSAR model that was previously published [18]. All the molecules were aligned with the training set of the 3D-QSAR model. The negative, positive, shape, and hydrophobic field points of each molecule were generated using the extended electron distribution (XED) force field in Forge. Partial least squares (PLS) regression analysis was used to confirm the predictive ability of the field-based 3D-QSAR model. Statistical parameters, such as non-cross-validated correlation coefficient (r 2 ), cross-validated correlation coefficient (q 2 ), and RMSE pred were calculated. The maximum number of components (N) was fixed to 20, while the maximum distance for sample point was set to 1Å. The LOO and LMO cross-validation methods were used. The parameters used in the conformation search, alignment, and model building are reported in Supplementary Materials Figures S1-S3.

In Silico Drug-Likeness, Bioavailability, and Toxicology Prediction
The SwissADME web tool was used to evaluate compounds' ADME [36]. It was accessed at http://www.swissadme.ch. Two dimensional (2D) chemical structures of the selected compounds were drawn and transferred as a list of molecules, defined by canonical simplified molecular input line entry specification (SMILES). All descriptors and molecular parameters were computed through the OpenBabel API (version 2.3.0, 2012). The pharmacokinetics section proposed one linear method for skin permeation, which relied on the quantitative structure-property relationship (QSPR) model developed by Potts and Guy that links the decimal logarithm of K p with MW and logP [36,37]. Multiple linear regression was used to predict K p . The prediction for passive human GI absorption (HIA) and BBB penetration consisted of the readout of the BOILED-Egg model, an intuitive graphical classification model.
For prediction of toxicological profile of selected compounds, we used admetSAR (version 2.0) online tool. It can be accessed at http://lmmd.ecust.edu.cn/admetsar2 [44,45]. Models used in the study were described in detail on the mentioned source file.

Conclusions
The cumulative risk of cancer incidence indicates that 1 in 8 men and 1 in 10 women will develop cancer in their lifetime [2]. When coupled with the estimated cost of US $1.16 trillion per year for cancer treatment and care, this clearly makes cancer a public health priority [1]. Despite significant progress in anti-cancer therapies, some cancers continue to have poor prognosis, which necessitates the development of new chemical entities. Considering that the process of introducing a new drug in the market is very difficult and time and money consuming, pharmaceutical companies and researchers pay much attention to computer-aided drug discovery techniques in various stages of drug discovery and development to minimize the failure rate [9]. Despite some disadvantages of in silico drug design methods such as inaccuracy of scoring functions for the evaluation of target-ligand binding free energy, difficulty of considering target flexibility in docking, or high false positive rate of VS [8,46], these methods have already helped to successfully introduce in market drug molecules, such as imatinib and other tyrosine kinase inhibitors [47], which have revolutionized leukemia treatment and enabled to treat cancer as a chronic disease like diabetes [48].
In the present study, the screening of new potential LAP inhibitors with a scaffold based on 3,4-dihydroisoquinoline from two databases: ZINC and PubChem as well as drug design using Spark software was performed. Through a statistical/computational filtration approach based on 2D descriptors, docking calculations, and 3D-QSAR statistical models, several and potentially active 3,4-dihydroisoquinoline derivatives were obtained. Finally, nine selected compounds with good drug-like, ADME pharmacokinetics parameters and toxicological profile were provided. Especially, compound one occurred to be a valuable starting point for the design of new synthetic derivatives with improved activity. The next steps will include the chemical synthesis of selected compounds, in vitro studies to experimentally confirm their inhibitory activity towards LAP and antiproliferative activity on cancer cell lines. In conclusion, these studies indicated that compounds with 3,4-dihydroisoquinoline moiety have potential to inhibit LAP and further studies on this topic are needed.
Supplementary Materials: The following are available online, Figure S1: Forge's parameters used for the conformation hunt, Figure S2: Forge's parameters used for the alignment, Figure S3: Forge's parameters used for building a model, Figure S4: Linear regression plot of experimental versus calculated pIC 50 values used in 3D-QSAR model, Table S1: Chemical structures, pIC 50 , ICM and LF score values for the compounds with an excellent description by the model.