Comparative Proteomics and Genome-Wide Druggability Analyses Prioritized Promising Therapeutic Targets against Drug-Resistant Leishmania tropica

Leishmania tropica is a tropical parasite causing cutaneous leishmaniasis (CL) in humans. Leishmaniasis is a serious public health threat, affecting an estimated 350 million people in 98 countries. The global rise in antileishmanial drug resistance has triggered the need to explore novel therapeutic strategies against this parasite. In the present study, we utilized the recently available multidrug resistant L. tropica strain proteome data repository to identify alternative therapeutic drug targets based on comparative subtractive proteomic and druggability analyses. Additionally, small drug-like compounds were scanned against novel targets based on virtual screening and ADME profiling. The analysis unveiled 496 essential cellular proteins of L. tropica that were nonhomologous to the human proteome set. The druggability analyses prioritized nine parasite-specific druggable proteins essential for the parasite’s basic cellular survival, growth, and virulence. These prioritized proteins were identified to have appropriate binding pockets to anchor small drug-like compounds. Among these, UDPase and PCNA were prioritized as the top-ranked druggable proteins. The pharmacophore-based virtual screening and ADME profiling predicted MolPort-000-730-162 and MolPort-020-232-354 as the top hit drug-like compounds from the Pharmit resource to inhibit L. tropica UDPase and PCNA, respectively. The alternative drug targets and drug-like molecules predicted in the current study lay the groundwork for developing novel antileishmanial therapies.


Introduction
Leishmaniasis is a neglected tropical disease caused by leishmania species, i.e., a sandfly vector-borne parasitic kinetoplastid protozoa [1]. The three major pathological forms of leishmaniasis are kala-azar or visceral leishmaniasis (VL), cutaneous leishmaniasis (CL), and mucocutaneous leishmaniasis (MCL). The most lethal one is VL, characterized by severe inflammatory reactions in the spleen and liver, which might be fatal. CL is the most frequently prevalent that infects the epidermal layer of the skin and leads to disfiguring lesions [2]. Leishmaniasis poses a threat to an estimated 350 million individuals in 98 countries. Globally, 12 million cases are reported annually, with an average of 2 to

Protein Sequence Retrieval
The protein sequences of mutidrug-resistant L. tropica axenic amastigotes were retrieved from the TriTrypDB-kinetoplastid informatics resource in FASTA format [15]. CD-HIT clustering analysis was conducted to eliminate paralogous proteins with sequence similarity criteria of ≥80% [16].

Essential Proteins Identification
The nonparalogous protein sequences were scanned against the eukaryotic database of essential genes (DEG). The DEG database contains experimentally validated essential genes from eukaryotes [17]. The standalone BLASTp program [18] was used to screen DEG against pathogen protein sequences with an E-value threshold of 10 −4 , bit score ≥100, sequence identity ≥35%, and query coverage ≥35%.

Human Host and Gut Nonhomolog Proteins Identification
The protein sequences were then subjected to BLASTp against the human proteome set acquired from the NCBI database [19] to identify host nonhomologous proteins. Protein sequences with a high degree of homology with the human proteome were eliminated, and human nonhomologous pathogen proteins were prioritized. The resultant nonhomologous proteins were scanned against human gut flora proteins [20] to avoid sequences showing homology to the gut microbiota. The threshold criteria for this BLASTp analysis were ≤35% sequence identity, ≤35% query coverage, and an E-value cut off 10 −4 .

Human Host and Gut Nonhomolog Proteins Identification
The protein sequences were then subjected to BLASTp against the human proteome set acquired from the NCBI database [19] to identify host nonhomologous proteins. Protein sequences with a high degree of homology with the human proteome were eliminated, and human nonhomologous pathogen proteins were prioritized. The resultant nonhomologous proteins were scanned against human gut flora proteins [20] to avoid sequences showing homology to the gut microbiota. The threshold criteria for this BLASTp analysis were ≤35% sequence identity, ≤35% query coverage, and an E-value cut off 10 −4 .

DrugBank Database Screening
The updated DrugBank database was screened to identify novel drug targets of L. tropica. A threshold of ≤35% sequence identity and query coverage was applied in this analysis [21]. Parasite proteins depicted as nonhomologous to the DrugBank were listed as alternative drug targets based on the set threshold criteria.

Structure Homologs Search
The PDB database was screened to identify structural information of the prioritized druggable proteins. Parasite proteins holding no structural information in PDB were modeled using the Swiss model by template base homology modeling [22]. The modeled 3D structures were then verified for accuracy using ERRAT and RAMPAGE [23].

Druggability Analyses
Several criteria are used to determine potential therapeutic targets, including molecular weight, function, cellular localization, and virulence factors [24]. The shortlisted prioritized proteins from the above analyses were tested for druggability potential. The subcellular localization of these target proteins was determined using PSORTb v.3.0.2 [25] and CELLO2GO V.2.5 [26] web servers. The drug molecule binding pockets of the targets were identified using PockDrug-server [27].

Pharmacophore-Based Virtual Screening
The Pharmit server was utilized for the pharmacophore-based virtual screening of millions of compounds from the built-in databases, including Molprot, ZINC, ChEMBL, and PubChem [28]. The 3D structure of the UDP-glucose pyrophosphorylase protein (PDB ID: 5NZM) and proliferating cell nuclear antigen (PCNA) (PDB ID: 6J0J) proteins were used for virtual screening. A total of nine pharmacophore features were enabled during the screening, including three aromatic and six hydrophobic features. The results were minimized to a significant level based on the RMSD value to shortlist the top-ranked inhibitors from the millions of drug-like compounds. The top ten hit compounds were later docked in the binding pocket of UDP-glucose pyrophosphorylase (LTRL590_180015300) and PCNA (LTRL590_150020700) using the CB-Dock tool to check their binding feasibility [29]. Proteinligand interactions were visualized using the Discovery Studio Visualizer [29].

ADME Analysis
The absorption, distribution, metabolism, and elimination (ADME) parameters of the top lead compounds were predicted using SwissADME [30]. ADME provides essential information regarding the drug-like capabilities of compounds. The SMILES format of the compounds was used as input to calculate the ADME parameters, physicochemical descriptors, pharmacokinetic features, and the drug-like nature of the lead compounds.

Essential Proteins Identification
A total of 8462 nonparalogous complete protein sequences of L. tropica strain L590 were subjected to BLASTp against the DEG database to identify pathogen-essential proteins. Essential proteins are indispensable for performing key cellular functions and pathogen survival [31]. The analyses identified 1225 pathogen protein homologs to the DEG entries (Supplementary File S1).

Human Host Nonhomologous Proteins
Pathogen-essential proteins are considered potential drug targets. However, these proteins must be nonhomologous to human proteomes and human gut microbiota proteins to avoid the adverse effects of drugs [32]. The nonparalogous pathogen-essential proteins were scanned against the human proteome. The analysis identified 727 pathogen proteins that were nonhomologous to the human host (Supplementary File S2). Additional comparative sequence scanning against the human gut flora proteome set prioritized 496 pathogen proteins that were nonhomologous to the human gut flora proteome (Supplementary File S3).

DrugBank Database Scanning
The 496 L. tropica essential proteins prioritized in the above steps were scanned against the DrugBank database. Pathogen proteins depicting no homologies to the known drug targets in the DrugBank database may be predicted as new or alternative drug targets [33]. DrugBank database scanning identified 19 already reported and 477 as DrugBank nonhomolog targets among the 496 pathogen-prioritized proteins.

Druggability Analysis
DrugBank nonhomologous proteins from L. tropica were subjected to subcellular localization analysis. Subcellular localization prediction is a key aspect of druggability analysis, and cytoplasmic proteins are considered ideal drug targets [34,35] (Figure 2). Therefore, the predicted cytoplasmic proteins were scanned against the PDB database to determine the three-dimensional (3D) structural information of the shortlisted pathogen proteins. However, none of the prioritized protein's 3D structural information was available in PDB; therefore, the shortlisted proteins' three-dimensional (3D) structures were modeled via homology modeling using their close structural homologs from PDB as a template. Eighteen L. tropica proteins were shortlisted for further druggability analysis. The Swiss model predicted the 3D structures of these proteins and was validated by the ERRAT tool with a quality factor score of >85%, representing high-quality protein models. Likewise, the Ramachandran plot predicted >85% to 90% residues of the modeled structures in the plot's allowed regions and ensured the proteins 3D structure model's accuracy [36]. Finally, nine proteins were prioritized based on pocket druggability analysis scores of >0.5 (Table 1).

DrugBank Database Scanning
The 496 L. tropica essential proteins prioritized in the above steps were scanned against the DrugBank database. Pathogen proteins depicting no homologies to the known drug targets in the DrugBank database may be predicted as new or alternative drug targets [33]. DrugBank database scanning identified 19 already reported and 477 as Drug-Bank nonhomolog targets among the 496 pathogen-prioritized proteins.

Druggability Analysis
DrugBank nonhomologous proteins from L. tropica were subjected to subcellular localization analysis. Subcellular localization prediction is a key aspect of druggability analysis, and cytoplasmic proteins are considered ideal drug targets [34,35] (Figure 2). Therefore, the predicted cytoplasmic proteins were scanned against the PDB database to determine the three-dimensional (3D) structural information of the shortlisted pathogen proteins. However, none of the prioritized protein's 3D structural information was available in PDB; therefore, the shortlisted proteins' three-dimensional (3D) structures were modeled via homology modeling using their close structural homologs from PDB as a template. Eighteen L. tropica proteins were shortlisted for further druggability analysis. The Swiss model predicted the 3D structures of these proteins and was validated by the ER-RAT tool with a quality factor score of >85%, representing high-quality protein models. Likewise, the Ramachandran plot predicted >85% to 90% residues of the modeled structures in the plot's allowed regions and ensured the proteins 3D structure model's accuracy [36]. Finally, nine proteins were prioritized based on pocket druggability analysis scores of >0.5 (Table 1).

Figure 2.
The subcellular localization prediction of the L. tropica essential, human nonhomolog proteins. Table 1. The molecular modeling, 3D structure validation, and druggability analysis of the prioritized shortlisted proteins of L. tropica.

Pharmacophore-Based Virtual Screening
Among the shortlisted drug target candidate proteins, UDP-glucose pyrophosphorylase (UGPase) (LTRL590_180015300) was prioritized for virtual screening based on druggability analysis and close structural homolog availability from PDB. The LtUG-Pase showed 100% query coverage and 98% sequences similarity with the Leishmania major LmUGP-murrayamine-I complex (PDB ID: 5NZM), available in PDB. So far, no commercial inhibitors or antileishmanial drugs have been reported based on the UDPase target. LmUGPase from L. major was used as a template for the homology modeling of L. tropica UGPase (LtUGPase). The pharmacophore model designed for the 3D structure of LtUGPase showed three aromatics and six hydrophobic features ( Figure 3A). The top-10-hit small molecules were selected based on the docking score and RMSD values ( Table 2). These 10 compounds were then subjected to molecular docking against LtUGPase to calculate the binding energies. The active site of LmUGP-murrayamine-I complex comprises four key residues: Arg248, Val371, Pro372, and Arg373 [37]. This information was used for structure-based pharmacophore model designing and drug-like compound databases virtual screening. The top-10-hit compounds were subjected to molecular docking analysis to evaluate the binding conformation of the lead compounds within the active site of the receptor molecule. The MolPort-000-730-162 compound showed significant molecular interactions with the conserved residues (370-380) in the C-terminus of LtUGPase and ranked as the top hit ( Figure 3B). Additionally, the docking analysis predicted that all the top-screened compounds exhibited substantial molecular interactions with the receptor ( Table 2, Figures S1 and S2).    The enzyme LtPCNA showed 100% query coverage and 94% sequence similarity with the crystal structure of PCNA from L. donovani (PDB ID: 6J0J). PCNA from L. donovani was used as a template to model the 3D structure of the LtPCNA protein. A structurebased pharmacophore model was designed against the LtPCNA target that exhibited three hydrogen donor and one hydrogen acceptor pharmacophoric features ( Figure 4A). Pharmacophore-based virtual screening predicted several druggable compounds against the LtPCNA target from the Pharmit resource. The top 10 hits were prioritized as drug-like compounds based on the docking scores and RMSD values (Table 3). Molecular docking analysis was performed to calculate the binding energies of these 10 compounds with the residues in the active sites of LtPCNA. Among these, MolPort-020-232-354 was identified as the top lead compound, showing significant hydrogen bond interactions with the LYS193, PRO212, GLY223, and ASN224 residues of LtPCNA ( Figure 4B). Furthermore, molecular docking analysis anticipated substantial interactions of all the top-screened compounds with LtPCNA (Table 3, Figures S3 and S4).

ADME Analysis
The molecular properties of drug-like small molecules are crucial for the effective drug's design, synthesis, and clinical applications. The four most important pharmacokinetic parameters are absorption, distribution, metabolism, and excretion (ADME). A lead compound must follow the ADME criteria to be a successful drug [38]. Chemical descriptors based on physiological properties and chemical structures were used to calculate the pharmacokinetic properties of the top 10 hit compounds. Multiple physicochemical properties, including molecular weight, hydrogen bonding, hydrophobicity, reactivity, bioavailability, molecular stability, aqueous solubility values (logP and logS), skin permeability coefficient (log kp), gastrointestinal tract absorption (GI), and blood-brain barrier (BBB) were computed to evaluate the efficacy of these compounds (Tables S1 and S2). Lipinski's rule of five is one of the most effective models for evaluating a suitable drug based on the solubility and permeability of a compound [39]. Among the top 10 lead compounds, C2-C7 and C10 showed drug-like properties based on Lipinski's rule of five, whereas C1, C8, and C9 were found to violate the rules. The compounds C1, C8, and C9 showed low gastrointestinal absorption, whereas C2-C7 and C10 exhibited high gastrointestinal absorption. All compounds showed low penetrability through the blood-brain barrier (BBB), except compound C5, which was predicted to be able to cross this barrier. All these compounds were substrates for the p-glycoprotein, which plays an important role in pumping xenobiotics and harmful substances back into the gut lumen, focusing on the propensity of these molecules to be potential inhibitors against LtUGPase [40]. None of these compounds showed any ADME toxicity or mutagenicity. However, compounds C1, C8, and C9 require additional chemical modification to fulfill drug-like properties (Table S1).
The top hit, drug-like compounds against the LtPCNA target are shown in Supplementary Table S2. The compounds C1, C2, C4, and C7-C9 demonstrated drug-like properties based on Lipinski's rule of five, whereas the compounds C3, C5, C6, and C10 violated Lipinski's rules. Based on the ADME results, the compounds C2, C4, C5, and C7-C9 exhibited higher gastrointestinal absorption, whereas the compounds C1, C3, C6, and C10 were predicted to have low gastrointestinal absorption. All the compounds were predicted to show low penetrability through the blood-brain barrier (BBB). Compounds C1-C4 and C9 were not predicted as p-glycoprotein substrates, whereas the compounds C5-C8 and C10 were predicted as p-glycoprotein substrates. There was no indication of ADME toxicity or mutagenicity for any of these compounds. Compounds C3, C5, C6, and C10 need additional chemical modifications to possibly exhibit more potent drug-like properties.

Discussion
Leishmaniasis has been listed as one of the most neglected tropical diseases by the World Health Organization (WHO), for which the development of new therapeutic strategies has become indispensable [41]. The present study analyzed the L. tropica proteome to identify suitable therapeutic targets against drug-resistant leishmaniasis. We followed a subtractive proteomic analysis approach to identify the L. tropica essential and human host nonhomologous proteins. According to the law of centrality and lethality, the functional perturbation of such proteins might be deleterious for the pathogen's survival [42,43]. These proteins were further shortlisted based on strict druggability criteria.
Among the shortlisted novel drug targets, UDP-glucose pyrophosphorylase (UGPase) (LTRL590_180015300) catalyzes the conversion of -D-glucose-1-phosphate (Glc-1P) and UTP into UDP-glucose (UDP-Glc), an important metabolite in the carbohydrate pathway of all organisms [44]. UDP-Glc is interconverted into UDP-Gal by catalytic reaction of UDP-Glc 4'-epimerase, which is connected to the biosynthesis of nucleotide sugars, confirming the role of this enzyme in galactose salvage, thus essential for parasite growth [45,46]. UGPase is considered a virulence factor, as it is necessary to manufacture cell surface glycoconjugates [47]. Leishmania species express a variety of glycoconjugates on their cell surfaces, which are constantly changing throughout the life cycle of these species. These glycoconjugates are associated with the survival and proliferation of parasites in the insect and the mammalian host [48,49]. A combination of gene deletion and protein destabilization techniques reported that reducing the level of UDPase to a minimum results in growth arrest and, ultimately, the cell death of L. major [50]. The structural analysis indicated that understanding the catalytic mechanism of UGPases can provide a template for designing species-specific UGPase inhibitors [47]. UGPase has been reported recently as a drug target against Leishmania species [51]. However, no potential drug is commercially available based on UGPase inhibition. Therefore, based on pocket druggability analysis and structural validation scores, the L. tropica UDPase protein (LtUGPase) was prioritized in the current study for virtual screening to identify drug-like small molecules to possibly inhibit the activity of this protein. These analyses and ADME profiling predicted MolPort-000-730-162 as a top-ranked molecule among the top-10-hit compounds that may inhibit LtUGPase. Experimental and clinical assays are required for further validation of these results.
Proliferative cell nuclear antigen (PCNA) putative protein (LTRL590_150020700) has also been found among the finally shortlisted druggable protein targets. PCNA plays a vital role in the DNA metabolism process in eukaryotes. PCNA augments polymerases' processivity by providing an anchor point for DNA polymerases and acting as a DNA sliding clamp protein. PCNA is necessary for DNA repair and recombination as well [52,53]. Studies have shown that PCNA is highly expressed in drug-sensitive and drug-resistant L. donovani strains in clinical isolates [54]. Pentavalent antimony is a commonly used drug against Leishmania infection. PCNA has a critical role in avoiding DNA damage caused by antimonials and is reported in association with antimony resistance in Leishmania spps [55,56]. Several lead inhibitors are predicted in the current study against the L. tropica PCNA target. Additional in vivo and in vitro assays are required to validate the antileishmanial efficacy of these predicted lead compounds.
3-mercaptopyruvate sulfurtransferase (LTRL590_050014400), identified as a potential drug target in the current study, is involved in thioredoxin and antioxidant metabolism [57]. This protein is reported to tolerate oxidative stress, i.e., caused by the host's immune system in Leishmania species [58]. It is responsible for the control of reactive oxygen species cytotoxicity in aerobic tissues [59]. The LTRL590 070011300 gene encodes a highly conserved ribosomal protein and is also shortlisted as a drug target in the current study. This ribosomal protein is a part of the 60S subunit of the ribosome responsible for rRNAs and protein domains unique to kinetoplastid ribosomes [60]. Keeping in view the evolutionary distinction between Leishmania and humans, the Leishmanial ribosome structure may open up new possibilities for developing potent antileishmanial therapies [61].
Tyrosyl-tRNA synthetase putative (TyrRS) (LTRL590_140021400) has also been shortlisted in the current analysis. TyrRS belongs to a family of aminoacyl-tRNA synthetases (aaRSs) involved in essential biological processes, including nucleotide binding, aminoacyl-tRNA ligase activity, tRNA aminoacylation for protein translation, and cellular proliferation [62]. TyrRS has been reported to play a possible immunomodulating role by inducing a proparasitic response and inflammatory phagocyte recruitment during Leishmania infections [63]. The E2-like ubiquitin conjugation enzyme (LTRL590_150018200) is also prioritized as a potential drug target that participates in the ubiquitination pathway and regulates various cellular functions [64]. Several ubiquitination pathway proteins have previously been reported as potential drug targets in trypanosomatid diseases [65,66].
Protein tyrosine phosphatase-like protein (PTP) (LTRL590_160007700) has also been shortlisted as a therapeutic target in the current analysis. PTPs have been implicated in controlling various cell activities, including eukaryotic cell proliferation and differentiation [67]. Leishmania PTP is involved in metabolic pathways that are involved in amastigote survival, differentiation, and enhancing virulence and disease progression in the host [68,69]. Structural analysis of Leishmania PTP shows uniquely remarkable conservation in the active sites of these proteins [70]. This information could be used to develop novel inhibitors to target parasite-specific PTPs.
ADP-ribosylation factor-like ARL protein 1 (LTRL590_170005800) is a tiny cytoplasmic guanosine-5 -triphosphate (GTP)-binding protein that has been associated with a variety of endo-and exocytic vesicular transport processes [71]. ARF undergoes conformational changes when binding with nucleotide, which modulates the affinity of ARL to bind with other proteins, lipids, or membranes [72]. The ARF protein is essential in vesicular trafficking and structural maintenance of the Golgi network in the eukaryotic cell [73]. ARF proteins are highly conserved among the trypanosomatids [74] and are associated with intracellular trafficking [75]. Inhibiting the activity of this protein could halt the pathogenesis of L. tropica and assist in developing novel therapeutic reagents.
The ADP/ATP carrier protein 1 (AAC), mitochondrial precursor, putative (LTRL590_190006600) protein is involved with the essential metabolic process of transporting ADP into the mitochondrial matrix and ATP to fuel the cell by maintaining high cytosolic ATP concentrations for energy-requiring metabolic processes [76]. Mitochondrial carrier proteins have been reported as drug targets against Trypanosoma brucei and Saccharomyces cerevisiae [77]. However, this protein has not been reported before as a promising drug target against Leishmania species. The analyses of the current study predict this protein as the promising druggable candidate to combat L. tropica infection.
The druggable proteins prioritized in the current study are highly conserved and important for the survival, growth, and virulence of the Leishmania parasite within the host. These proteins may therefore implicate potential therapeutic targets in the future. The exploration of refining the 3D structures of these proteins may lay the groundwork for suitable antileishmanial drug development. The in silico approaches used in this study could pave the way to identify novel therapeutic targets and develop species-specific potent drugs that aid in eliminating many parasitic diseases.

Conclusions
In the present study, we used the entire proteome data of drug-resistant L. tropica to determine new and potent therapeutic targets against L. tropica infection. A comparative subtractive genome approach was used to identify parasite-specific essential genes involved in the metabolic pathways responsible for pathogen survival, proliferation, and virulence. In silico druggability analysis prioritized several novel drug targets against L. tropica that have not been previously reported. Pharmacophore-based virtual screening of updated biological databases, ADME evaluation, and docking studies prioritized several drug-like small compounds against newly identified LtUGPase and LtPCNA targets that may be tested in the future with respect to antileishmaniasis activity.