Design of Inhibitors That Target the Menin–Mixed-Lineage Leukemia Interaction

The prognosis of mixed-lineage leukemia (MLL) has remained a significant health concern, especially for infants. The minimal treatments available for this aggressive type of leukemia has been an ongoing problem. Chromosomal translocations of the KMT2A gene are known as MLL, which expresses MLL fusion proteins. A protein called menin is an important oncogenic cofactor for these MLL fusion proteins, thus providing a new avenue for treatments against this subset of acute leukemias. In this study, we report results using the structure-based drug design (SBDD) approach to discover potential novel MLL-mediated leukemia inhibitors from natural products against menin. The three-dimensional (3D) protein model was derived from Protein Databank (Protein ID: 4GQ4), and EasyModeller 4.0 and I-TASSER were used to fix missing residues during rebuilding. Out of the ten protein models generated (five from EasyModeller and I-TASSER each), one model was selected. The selected model demonstrated the most reasonable quality and had 75.5% of residues in the most favored regions, 18.3% of residues in additionally allowed regions, 3.3% of residues in generously allowed regions, and 2.9% of residues in disallowed regions. A ligand library containing 25,131 ligands from a Chinese database was virtually screened using AutoDock Vina, in addition to three known menin inhibitors. The top 10 compounds including ZINC000103526876, ZINC000095913861, ZINC000095912705, ZINC000085530497, ZINC000095912718, ZINC000070451048, ZINC000085530488, ZINC000095912706, ZINC000103580868, and ZINC000103584057 had binding energies of −11.0, −10.7, −10.6, −10.2, −10.2, −9.9, −9.9, −9.9, −9.9, and −9.9 kcal/mol, respectively. To confirm the stability of the menin–ligand complexes and the binding mechanisms, molecular dynamics simulations including molecular mechanics Poisson–Boltzmann surface area (MM/PBSA) computations were performed. The amino acid residues that were found to be potentially crucial in ligand binding included Phe243, Met283, Cys246, Tyr281, Ala247, Ser160, Asn287, Asp185, Ser183, Tyr328, Asn249, His186, Leu182, Ile248, and Pro250. MI-2-2 and PubChem CIDs 71777742 and 36294 were shown to possess anti-menin properties; thus, this justifies a need to experimentally determine the activity of the identified compounds. The compounds identified herein were found to have good pharmacological profiles and had negligible toxicity. Additionally, these compounds were predicted as antileukemic, antineoplastic, chemopreventive, and apoptotic agents. The 10 natural compounds can be further explored as potential novel agents for the effective treatment of MLL-mediated leukemia.


Introduction
Chromosomal translocations that involve the KMT2A gene, which is also known as the mixed-lineage leukemia 1 (MLL1) gene, are the source of a portion of cases of acute leukemia [1].This chromosomal rearrangement causes the production of MLL fusion proteins, also known as MLL-FPs, and causes their proliferation [2].These translocations are found in around 10% of all acute leukemia cases [3].Mixed-lineage leukemia (MLL)mediated leukemia affects both children and adults; however, it has a prevalence of 70 to 80% in infants [4,5].The prognosis for this disease is poor for infants, especially those with primarily acute lymphoblastic leukemias, which comprise approximately 10% of cases and 2.8% of acute myeloid leukemia cases [3,6].New medicines are urgently needed for the aggressive kind of leukemia caused by MLL1 gene translocation, which affects 5-10% of patients with acute leukemia [7].Higher rates of relapse and resistance to chemotherapy are two additional effects of the MLL1 mutation [8].As a result, the prospects for patients with MLL1 rearrangements continue to be dismal compared to those for patients with other types of leukemia.
Within the context of acute leukemia, the menin protein serves as an essential oncogenic cofactor for MLL-FPs [9].MLL-FPs are distinguished by an N-terminal of DNA-interacting domains of MLL that, when in frame, fuses with the C-terminal of the fusion partner.It has been found that the part of the MLL N-terminus that binds to menin is very important for leukemogenic transformation [10].Menin is required for the regulation of MLL target genes including HOXA9 and MEIS1, and acts as a highly specific binding partner of MLL-FPs [11].These genes have been identified to operate in synergy to ensure the maintenance and survival of leukemic cells [12,13].In vitro and in vivo experiments have shown that the oncogenic ability of MLL-FPs is abolished when they are unable to bind menin.This has established menin as a critical oncogenic cofactor of MLL-FPs necessary for their leukemogenic activity [7,14].For MLL-FP leukemogenesis, it is crucial to have the interaction between the MLL amino-terminal sequence and menin [9].Therefore, the discovery of menin inhibitors will provide an avenue for new therapies for MLL.
Currently, there are few therapies that have been identified for MLL-mediated leukemia that are non-toxic [15].Menin inhibitors are a promising novel therapy and have been found to be most successful for KMT2Ar and NPM1-mutated acute leukemias.However, it is possible that additional subsets of leukemia respond to these treatments as well [9].The search for new therapies for acute leukemia is currently underway; no fewer than four menin inhibitors are being used clinically after preclinical trials, two of which are SNDX-5613 and KO-539.These inhibitors are currently being tested to determine the tolerability and response rates in patients [9].This shows that there is a need to identify menin inhibitors as potential therapeutic solutions to MLL-mediated leukemia.
In an effort to identify natural compounds that can be turned into chemotherapies for the treatment of MLL-mediated leukemia, this study aimed to identify menin inhibitors from the Traditional Chinese Medicine (TCM) database [16] via docking.In a previous study that also employed both in silico and in vitro methods, 5 out of 74 compounds tested were found to be potential inhibitors of menin-MLL, with DCZ_M123 having the greatest inhibitory efficacy with an IC 50 of 4.71 ± 0.12 μM [8].The literature has suggested that menin is a promising target; therefore, research on finding novel drugs to fight against MLL-mediated leukemia is still needed [17,18].
Computer-aided drug design has become one of the most cost-effective and reliable approaches to discovering novel therapies [19].As previously mentioned, virtual screens of known inhibitors targeting the menin-MLL interface have successfully been performed.However, more research on novel leads is vital to combating MLL-mediated acute leukemia.The purpose of this research is to use computational methods to shortlist potential menin inhibitors derived from natural products.The natural product library of 25,131 compounds was docked against menin using AutoDock Vina [20].The potential lead compounds were then evaluated to determine if they are anti-cancer drugs using Way2Drug [21].Then, molecular dynamics (MD) simulations were used to further evaluate the stability of top compounds.In addition, the study compared the binding affinities of known inhibitors to those of potential lead compounds.

Materials and Methods
This study employed several computational techniques including molecular docking, molecular dynamics simulations, and molecular mechanics Poisson-Boltzmann surface area calculations to identify potential menin inhibitors (Figure 1).Prior to the docking stage, the ability of AutoDock Vina to effectively predict menin binders was evaluated.The shortlisted compounds were further subjected to ADMET screening and biological activity predictions (Figure 1).

Rebuilding the Menin Structure
Two freely available protein structure modeling platforms comprising EasyModeller and I-TASSER were employed to predict a reasonably good and complete structure of menin.For each modeling platform, five structures were generated.EasyModeller 4.0 was used to generate five models of menin using the 4GQ4 structure as a template.The five structures were assessed based on their discrete optimized protein energy (DOPE) scores and the structure with the lowest DOPE score was selected.
The complete amino acid sequence of menin retrieved from UniProt was used as input for I-TASSER modeling.Using the 3U84 structure, I-TASSER also predicted five structures of complete menin [27][28][29].The best I-TASSER-generated structure was selected based on the C-score.The C-score is the confidence quantitative measure for each model that is produced from I-TASSER.The value is based on the significance of threading template alignments and convergence parameters of structure assembly simulations [27][28][29][30].A higher value of the C-score usually indicates a more reliable protein model and it normally ranges from −5 to 2 [31].

Model Quality Assessment
The quality of the top 2 models, 1 from each technique, was evaluated using SAVESv6.0 (https://servicesn.mbi.ucla.edu/SAVES/,accessed on 10 February 2023).SAVESv6.0 is a meta-server that runs quality assessments on protein structures.SAVESv6.0 requires the submission of a protein file exclusively in PDB format.Following the upload of the protein structure, the user proceeds by selecting the "Run Programs" button, after which the relevant evaluation programs are initiated by clicking the respective "Start" buttons.Three programs, namely ERRAT [32], VERIFY [33,34], and PROCHECK [35], that can be run by SAVES v6.0 were considered for the quality assessments.Ramachandran plots were obtained via the PROCHECK program.The structure with the better evaluation outcome was selected as reasonably the best model and used for the study.

Obtaining Compounds
The ligand library was obtained from the TCM database [16], which contained 35,161 traditional Chinese medicine natural products.Two known menin inhibitors (PubChem CIDs: 71777742 and 36294) were obtained via PubChem.Another known menin inhibitor, MI-2-2, was extracted from the co-crystallized menin structure in the PDB database (Protein ID: 4GQ4) [11].The aim was to perform molecular docking studies on these known inhibitors to compare the results to those from the TCM ligand library to aid in prioritizing compounds as potential treatment options for MLL-mediated leukemia.

Protein and Ligand Library Preparation
The 35,161 ligands obtained from the TCM library [16] were pre-filtered using OSIRIS DataWarrior [36] to select only compounds that had molecular weights between 150 and 600 g/mol, as previously carried out [37][38][39][40].It was also used to remove duplicates.Following the filtering of the library with DataWarrior, the 25,196 compounds that were left were subjected to an energy minimization procedure utilizing the universal force field (UFF) [41,42] and conjugate gradient algorithm in 200 steps.The ligands were then converted into PDBQT format, which is supported by AutoDock Vina [20].
The selected protein model was subjected to energy minimization using two different force fields, the optimized potentials for liquid simulations (OPLS)/all-atom (AA) and CHARMM36 force fields.A short 20 ns MD simulation was also performed using the 2 force fields.This was performed in order to select the force field that would produce a lower energy and a more stable menin structure.The Xmgrace plotting tool was used to generate the graphs.Also, for the selected structure, clustering analysis via the "gmx cluster" command was used to group conformations from the trajectory based on their structural similarity, using the "gromos" method.The representative structure of the largest cluster was then selected.
To eliminate potential docking simulation interferences, the energy-minimized protein model was checked for the presence of water molecules and ions, and these were removed.After that, PyMOL 1.7.4.5 [43] was used to save the protein structure as a pdb file.The "make macromolecule" option in PyRx [44] was used to convert the protein structures into AutoDock Vina's pdbqt format, which was necessary for the virtual screening.

Virtual Screening
AutoDock Vina [20] via PyRx was used to run the virtual screening.A grid box dimension of 32.879 × 25.923 × 30.494Å 3 and the protein centered at x = 57.954Å, y = 57.286Å, and z = 58.794Å were used in order to cover the 0RT binding site.The protein was kept in a rigid conformation during the docking process while the ligands were treated as flexible entities during the docking simulations, allowing AutoDock Vina to generate up to 9 conformations or poses for each compound.Prior to the screening, the quality of the pose prediction conducted via AutoDock Vina was assessed by comparing the best-ranked pose to that in the experimental situation, as previously carried out [37,[45][46][47][48]. Four known menin inhibitors comprising MI-2 (0RO), MI-2-2 (0RT), MI-89, and MIV-6 were extracted from structures with the corresponding PDB IDs, 4GQ3, 4GQ4, 6E1A, and 4OG8, and were docked against the menin protein.The top-ranked pose, characterized by the most negative binding energy, was subsequently aligned structurally with the conformation in the crystallographic structure using the rigid module of LS-Align [49].The alignments were then assessed using the RMSD values.
The three known inhibitors and the filtered ligand library were virtually screened against menin.After virtual screening, compounds with binding energies of less than −9.0 kcal/mol were chosen for further investigation.Previous studies have shown that binding affinities that are −7.0 kcal/mol can distinguish between specific and non-specific protein-ligand bonds [50].All the compounds that passed the −9.0 kcal/mol threshold were analyzed with PyMOL 1.7.4.5 [43] and the best potential ligands that docked deeply in the binding site were studied further.The shortlisted compounds underwent re-docking against the best structure from the clustering analysis to ascertain their binding affinity to menin.

SwissADME Screening of Ligands
After a successful virtual screening, the pharmacological profiles of the selected ligands were obtained through the use of SwissADME [51].Lipinski and Veber's rules were used to ascertain whether or not these ligands with the highest binding affinities are drug-like.Toxicity risks including the mutagenicity, tumorigenicity, reproductive effect, and irritancy of these ligands were also analyzed using OSIRIS DataWarrior 5.5.0 [36].

Prediction of Biological Activity of Compounds
Prediction of activity spectra for substances (PASS) was utilized to ascertain the biological activities of the shortlisted compounds [21,52].Using a Bayesian approach, the PASS system is able to predict the biological activity of each compound using its simplified molecular line entry system (SMILES) format as input [21,52].

Determining Binding Interactions
LigPlot+ v.2.2 [53] was used to analyze the protein-ligand interactions using default parameters.This program showed hydrogen bonding and hydrophobic interactions between the protein and ligands.The menin-ligand complexes post-docking were saved in PDB format and uploaded to LigPlot+ v2.2.The ligands' interactions with menin were then visualized and analyzed.

MD Simulations of Protein-Ligand Complexes
For each system, three independent molecular dynamics (MD) runs were performed.To prepare the ligands for the MD simulations, the topologies of each ligand were generated using the CHARMM General Force Field (CGenFF) server [54].GROMACS version 5.1.5was used to run the MD simulations for the unbound menin and the protein-ligand complexes.Each system was solvated in a cubic box positioned 1.0 nm away from the protein on all sides.Sodium or chlorine ions were used to neutralize each system.After energy minimization, the complexes were equilibrated to achieve optimum conditions of pressure and temperature, and then 100 ns MD simulations were carried out.After the MD simulations, Xmgrace was used to generate graphs for analysis.

MM/PBSA Calculations of Protein-Ligand Complex
The molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method was utilized to analyze the binding free energies of the menin-ligand complexes [55][56][57].This system provides the thermodynamic cycle, which incorporates both the molecular mechanical energies and the continuum solvent representation.The calculations for MM/ PBSA were conducted using g_mmpbsa [55].This program computes the contributing energy components for the complex and individual energies of the residues.Three MM/ PBSA calculations were performed on the outputs from the three independent MD runs using the full menin structure in complex with the shortlisted ligands.To plot the computational statistics, the R programing system [58] was employed.

Primary Structure Analysis
The protein sequence of menin is composed of 615 amino acid residues with UniProtKB ID O00255.ProtParam [59] was used to determine the physical and chemical parameters of menin including its molecular weight and chemical formula.its molecular weight was approximately 67,383.72 Da and its chemical formula is C 3003 H 4709 N 831 O 898 S 16 .

Protein Structure Identification-
The crystal structure of human menin with bound inhibitor MI-2-2 (PDB ID: 4GQ4) was solved using X-ray crystallography with a resolution of 1.27 Å [11].The leukemogenic effect of the MLL fusion protein can be inhibited using a menin inhibitor, which prevents the menin protein from interacting with MLL [8].MI-2 is the most potent inhibitor of the menin protein with an IC 50 of 446 nM.MI-2-2 was developed and the IC 50 was refined 10-fold.This inhibitor, however, did not show promising results in vivo due to its cellular activity and poor metabolic stability [60].From the literature, 4GQ4 has been used in other in silico studies with promising results [8].Thus, the 4GQ4 structure was selected as the main template for the menin protein.

Model Rebuilding Using
EasyModeller-To fix missing residues, the protein required additional modeling.EasyModeller 4.0 is a tool used for protein homology modeling and the optimization of the protein models generated [24].The menin protein (Protein Databank ID: 4GQ4) was solved using X-ray crystallography with a resolution of 1.27 Å.However, due to missing residues in the amino acid sequence in the 4GQ4 structure, the EasyModeller 4.0 program was used to reconstruct the protein structure.Five protein models were generated from EasyModeller with DOPE scores of −68,135.84375,−68,138.74219,−67,765.52344,−68,097.15625,and −68,047.66406,respectively.All five models had a genetic algorithm 341 (GA341) score of 1.The second model (MOD2) with a DOPE score of −68,138.74219 was selected as the best structure among the EasyModeller 4.0-generated models (Figure 2a).

Structure Prediction
Using I-TASSER-Another modeling platform, I-TASSER, was employed to predict reasonable models of the complete menin structure.I-TASSER is a software that predicts protein structures based on structural templates from the protein database [27,30].I-TASSER [27][28][29][30] also predicted five models for the complete structure of menin using PDB ID 3U84 as a template [61].I-TASSER uses a template-based method to develop protein structures and a program called SPICKER to refine the structural models.Briefly, 3U84 is a menin structure solved at a resolution of 2.5 Å.Five models were predicted using I-TASSER with C-scores of −0.5, −1.2, −0.89, −3.51, and −3.27.The protein structure with the highest C-score of −0.5 (ITAS1) was selected as the most reasonable I-TASSER model for menin (Figure 2b) [30].

Protein Model Quality Assessment-Aligning both MOD2 and ITAS1 using
PyMOL yielded an RMSD of 0.545 Å, implying that the two structures are not structurally diverse.The structural and stereochemical qualities of ITAS1 and MOD2 were analyzed using SAVES v6.0.ITAS1 had an ERRAT overall quality factor of 88.9447, VERIFY score of 82.44%, and six PROCHECK errors, two warnings, and one pass.The ERRAT plot of ITAS1 demonstrated that amino acid residues 70, 127-130, 133-135, and 202-207 were the most erroneous sections of the protein.For MOD2, the ERRAT score was 70.5479, the VERIFY score was 80.65%, and it had three PROCHECK errors, three warnings, and three passes.
The structured parts of MOD2 and ITAS1 were also analyzed since the disordered regions could influence the quality metrics.It was observed that ITAS1 had an ERRAT overall quality factor of 92.3237 and VERIFY score of 82.06% (pass) while MOD2 had ERRAT and VERIFY scores of 85.1153 and 79.44%, respectively.This showed that the structured regions of ITAS1 had better scores than those of MOD2.ITAS1 was selected as the most reasonable complete menin structure and was used for the study.PROCHECK was employed to compute the Ramachandran plot for ITAS1 to determine the stereochemistry of the protein model based on overall geometry and residue-by-residue geometry [35].The quality of the protein is based on the percentage of residues in the most favorable, additionally allowed, generously allowed, and disallowed regions, which is shown in the Ramachandran plot [62].ITAS1 had 391, 95, 17, and 15 residues representing 75.5, 18.3, 3.3, and 2.9% of residues in the most favored regions, additionally allowed regions, generously allowed regions, and disallowed regions, respectively (Figure 3).

Force Field Selection
The CHARMM36 force field was identified to be the optimal fit after subjecting the menin protein to energy minimization and 20 ns MD simulation.OPLS demonstrated a lower energy of −2,784,128.25 kJ/mol in 145 steps while CHARMM36 had a potential energy of −2,706,664.50 kJ/mol in 1090 steps (Supplementary Material Figure S1).However, after the 20 ns MD simulation, the OPLS entire menin structure had an average RMSD of 0.511 ± 0.127 nm while the CHARMM36 entire structure had an average RMSD of 0.382 ± 0.58 nm (Figure S2a), implying that the CHARMM36 structure was more stable.Also, the average Rg value of the CHARMM36 structure (2.829 ± 0.019 nm) was lower than that of OPLS (2.874 ± 0.027 nm) [Figure S2c].To ensure that the disordered regions did not adversely influence these values, the analyses were also conducted on only the ordered regions of menin.RMSD and Rg analyses specifically focused on the ordered parts of menin mirrored the trends observed in the analyses of the entire structure (Figure S2b,d).For the ordered regions only, the OPLS and CHARMM36 simulations yielded average RMSDs of 0.47 ± 0.11 and 0.36 ± 0.05 nm, respectively.Correspondingly, the Rg values for the ordered portions were 2.69 ± 0.02 and 2.64 ± 0.01 nm for OPLS and CHARMM36 simulations, respectively.The difference in the Rg and RMSD values between the complete protein and the ordered regions was significantly small, that is ~0.05 nm (0.5 Å) for RMSD and ~0.19 nm (1.9 Å) for Rg.Considering the results from the MD analyses, the CHARMM36 force field was selected for the study.

Preparation of Screening Library
A pre-filtered screening library of compounds was produced, which had 25,196 Chinese natural molecules in total from the TCM database.Compounds that have been shown to inhibit the menin-MLL interaction were also identified.MI-2-2 (PubChem CID: 72201086), developed subsequent to MI-2, a potent inhibitor of the menin (IC 50 of 446 nM), has been reported to exhibit a refined IC 50 that is tenfold lower than that of MI-2, with a value of 46 nM [63] [2,3-d] pyrimidine inhibitor (Code: 0RT)] is shown to inhibit the interaction of menin with MBM1 at an IC 50 of 46 nM [63].Due to its inhibitory activity with the menin-MLL interaction, MI-2-2 was one of the known compounds chosen to compare binding affinities against those in the compound library virtually screened.It was extracted from the 4GQ4 complex and saved in sdf format.The compound MIV-6 (PubChem CID: 71777742) is a hydroxy-and aminomethylpiperidine inhibitor which has IC 50 = 56 nM [8,63].This class of inhibitors directly interferes with MLL-menin and disrupts the protein-protein interaction.In addition, MIV-6 has been shown to induce differentiation and selectively block the proliferation of MLL cells [64].An aminoglycoside antibiotic, tobramycin (PubChem CID: 36294), was identified to have Ki = 59.9 μM.Tobramycin has been shown to competitively occupy the active site of MLL-Fps [64].All ligands used in the study were energy minimized using the UFF under the conjugate gradient algorithm.Prior to virtual screening, the ligands were converted into the pdbqt format using Open Babel [65] in PyRx.

Virtual Screening against ITAS1
AutoDock Vina [20] was used to virtually screen the ligands against the re-modeled menin protein.The program requires a grid box to be pre-calculated, which includes the area of the protein that composes the target binding region.Prior to the docking run, the ability of AutoDock Vina to produce identical poses to experimentally determined conformations was evaluated using LS-Align [49].The best-ranked poses of each of the four known inhibitors were superimposed onto their co-crystallized poses and the RMSD was determined.The best ranked poses were used for validation since the most energetically favorable conformations of the TCM library will be used for further analysis.For MI-2 (0RO), an RMSD of 1.02 Å was obtained after rigid alignment was performed (Figure S3a).For MI-2-2 (0RT), MI-89, and MIV-6, RMSDs of 2.59, 1.33, and 3.58 Å, respectively, were obtained when the best pose was compared to that of the crystallographic or reference structures (Figure S3b-d).An RMSD equal to or less than 2 Å is deemed very good, while an RMSD between 2 Å and 3 Å is considered acceptable.Conversely, an RMSD exceeding 3 Å is considered suboptimal [66].Nevertheless, a cut-off value of RMSD < 2 Å is generally recognized as the most effective threshold for validating accurately posed molecules [45,67].The RMSD values obtained for 0RO and MI-89 show that AutoDock Vina is near-perfect for predicting menin inhibitors, while 0RT's RMSD places AutoDock Vina in the acceptable range.Only one compound's (MIV-6) RMSD value indicates poor pose prediction, suggesting a ~75% chance of AutoDock Vina predicting acceptable or perfect poses for menin binders.
The compound library containing 25,131 compounds in total and three known menin-MLL interaction inhibitors were successfully screened against the energy-minimized protein.The MI-2-2 (0RT) binding site on the 4GQ4 structure was selected for docking.MI-2-2 is located in a hydrophobic pocket and forms a hydrogen bond with Tyr323 [8,11].For the canonical menin sequence, Tyr323 on 4GQ4 maps to Tyr318 on the extracted sequence from UniProt (ID: O00255).
Studies have shown that binding affinities that are −7.0 kcal/mol or lower can distinguish between specific and non-specific protein-ligand bonds [50].Compounds with a binding energy of −9.0 kcal/mol or less were chosen for additional study to further refine the results.In total, 564 compounds passed the criteria.ZINC000103526876 had the highest binding affinity with an energy of −11 kcal/mol (Table 1).Additionally, ZINC000103436976, ZINC000070451007, and ZINC000095913861 had good binding energies of −10.9, −10.7, and −10.7 kcal/mol, respectively (Table 1).The virtual screening results of the known menin-MLL interaction inhibitors were also examined.Tobramycin (CID: 36294) demonstrated a binding energy of −6.7 kcal/mol.MIV-6 (CID: 71777742) was shown to have a binding energy of −6.5 kcal/mol and MI-2-2 (0RT) had a binding energy of −7.8 (Table 1).
The topmost compounds were re-docked against the representative structure of the largest or best cluster after performing clustering analysis on the 20 ns MD system.All the potential lead compounds demonstrated better binding affinity to the menin structure than the known inhibitors (Table 1).The virtual screening results were reviewed using PyMOL 1.7.4.5 [43].
All of the ligands docked deeply into the selected menin binding site.This binding pocket plays a role in the binding of MI-2-2, a potent inhibitor of menin-MLL interaction, and thus may play a pivotal role in developing novel classes of inhibitors.Disrupting the interaction between MLL-FPs and menin will be essential for the treatment of MLL-mediated leukemia.

ADMET Prediction
Compounds with the highest binding affinity after screening with AutoDock Vina were subjected to profiling using SwissADME [51].SwissADME is a program that computes pharmacokinetic properties and drug-like nature of compounds to guide drug discovery [51].The drug-likeness of the ligands with the highest binding affinities was determined based on Lipinski's [69] and Veber's rules.Lipinski's "rule of 5" requires a drug to have a mass less of than 500 g/mol, no more than 10 H-bond acceptors, no more than 5 H-bond donors, and less than 5 octanol-water partition coefficient (log P) values.Veber's rule states that a drug should have a topological polar surface area (TPSA) equal to or less than 140 Å 2 and less than 11 rotatable bonds [70].
Although some natural compounds do not adhere to Lipisnki's or Veber's rules, there are current chemotherapy treatments to suggest that they play an important role in the treatment of human ailments [71,72].Some examples are Taxol, vinblastine, and camptothecin.Additionally, other cancer chemotherapeutic agents come from the marine environment including dolastatins and microbes like bleomycin [71].It is reasonable to suggest that there are natural biological compounds that would be effective in targeting the binding sites of therapeutic targets.For the TCM ligand library, 360 compounds passed the Lipinski rule and 375 compounds passed Veber's rule.In total, 295 compounds passed both the Lipinksi and Veber rules and were shortlisted.OSIRIS DataWarrior 5.0.0 was used to determine the toxicity profiles of the 295 potential lead compounds.Compounds that were high or low in mutagenic and tumorigenic effects were eliminated since the goal of this study was to discover anti-leukemia (anticancer) drugs.In addition, any compounds that had both reproductive and irritancy effects were eliminated.In total, 199 compounds were deemed to be safe and used for further analysis.

Prediction of Biological Activity of Lead Compounds
Prediction of activity spectra of substances (PASS) determines the properties of biologically active substances [49].The top 10 compounds, ZINC000103526876, ZINC000095913861, ZINC000095912705, ZINC000085530497, ZINC000095912718, ZINC000070451048, ZINC00 0085530488, ZINC000095912706, ZINC000103580868, and ZINC000103584057, all having binding energies of −9.9 kcal/mol or lower, were further analyzed using PASS.PASS uses the similarity between a query molecule and structures within its training dataset to predict the activity of the query molecule.The probability of a compound to be active (Pa) determines the likelihood of the query compound belonging to the subset of active compounds, while the probability of it being inactive (Pi) suggests the similarity of the query to the inactive subset of the PASS training dataset [73,74].The probability values (both Pa and Pi) range from 0 to 1, with a higher value suggesting a higher likelihood of activity (Pi) or inactivity (Pi).Generally, compounds with Pa > Pi are deemed likely to demonstrate the particular activity predicted.
Most of the compounds were predicted to have antineoplastic (leukemia, breast, and lung cancer), apoptosis agonist, and chemopreventive properties.The compound ZINC000103526876 with the highest binding affinity (−11 kcal/mol) was predicted as antineoplastic (Pa: 0.864 and Pi: 0.006) and an apoptosis agonist (Pa: 0.655 and Pi: 0.020).ZINC000095913861 with a binding energy of −10.7 kcal/mol was predicted to also have antineoplastic properties (Pa: 0.928 and Pi: 0.005) and to be an apoptosis agonist (Pa: 0.731; Pi: 0.012).Similarly, compounds ZINC000095912705, ZINC000095912706, and ZINC000103584057 were predicted to be antineoplastics (Pa values: 0.927, 0.896, and 0.977; Pi values of 0.005, 0.005, and 0.004, respectively) and apoptosis agonists (Pa values of 0.755, 0.663, and 0.779; Pi values of 0.010, 0.019, and 0.009, respectively).Furthermore, compounds ZINC000085530497, ZINC000095912718, ZINC000085530488, and ZINC000103580868 were predicted to be antineoplastics with Pa values of 0.938, 0.959, 0.931, and 0.812, and corresponding Pi values of 0.004, 0.004, 0.005, and 0.01, respectively.ZINC000085530497 was predicted to be an apoptosis agent (Pa: 0.449 and Pi: 0.052).Antineoplastic agents, also known as anticancer drugs, are a diverse class of drugs.They can be classified as alkylating agents, antimetabolites, and natural products [75].Cancerous cells, including acute leukemia cells, avoid apoptosis in order to proliferate and resist chemotherapy treatments [6].Thus, apoptosis agonists are essential to developing novel treatments for acute leukemias including MLL-mediated leukemia.
Multiple compounds were shown to have chemopreventive properties including ZINC000103526876 (Pa: 0.403; Pi: 0.022) and ZINC000095913861 (Pa: 0.538; Pi: 0.011).Chemopreventive medicines serve to prevent cancer by preventing DNA damage or inhibiting the proliferation of pre-cancerous cells that have DNA damage.[76].Additionally, these agents can induce apoptosis and angiogenesis [77].There has been extensive research on chemopreventive agents that can be derived from fruits and vegetables.It has been shown that the regular consumption of certain fruits and vegetables can lower the risk of obtaining specific cancers [77][78][79].Some of these agents include diallyl sulfide, allicin, capsaicin, and anethol [77].They have also been discovered to reverse the resistance effects of chemotherapy and radiation in patients who are on these treatments [77].Some compounds were shown to have antileukemic properties including ZINC000085 530497 (Pa: 0.699; Pi: 0.005), ZINC000095912718 (Pa: 0.378; Pi: 0.029), ZINC000085530488 (Pa: 0.562; Pi: 0.010), ZINC000103580868 (Pa: 0.564; Pi: 0.010), and ZINC000103584057 (Pa: 0.709; Pi: 0.005).Discovering new antileukemic drugs is essential to treatment efforts for the subset of patients that harbor resistance to other antileukemic therapies and medications.Thus, there is a growing interest in designing novel drugs that have low clinical toxicity [80].
In addition, most of the compounds were predicted to be antineoplastic agents for other cancers including cervical, lung, and breast cancer.ZINC000103526876 was predicted to be an antineoplastic for breast cancer (Pa: 0.574; Pi: 0.013) and lung cancer (Pa: 0.495; Pi: 0.014) while ZINC000095912705 was predicted as an antineoplastic for lung cancer (Pa: 0.658; Pi: 0.007).ZINC000085530497 was predicted to be an antineoplastic for lung cancer (Pa: 0.708;Pi: 0.010) and ZINC000095912718 was predicted to be so for cervical cancer treatment (Pa: 0.636; Pi: 0.004).ZINC000085530488 was also predicted as an antineoplastic for breast cancer (Pa: 0.734; 0.005) and lung cancer (Pa: 0.657; Pi: 0.007).Additionally, ZINC000095912706 and ZINC000103580868 were predicted as antineoplastics for lung cancer (Pa: 0.566 and 0.569; Pi: 0.010 and 0.01, respectively).
The amino acid residue Phe243 formed interactions with all of the compounds, making it possibly an important residue.Additionally, Met283, Cys246, Tyr281, Ala247, Ser160, Asn287, Asp185, Ser183, Tyr328, Asn249, His186, Leu182, Ile248, and Pro250 were involved in the majority of binding in the active site (Table 1 and Figure 5 and Figure S4a-i).Thus, these residues are possibly essential amino acid residues needed for the binding and stability of ligands.Furthermore, compound 36294 was discovered to have the maximum hydrogen bonds among the known inhibitors with six bonds, and ZINC000095912706 formed three hydrogen bonds, making it the compound from the TCM database that has maximum hydrogen bond interactions with menin.

Molecular Dynamics Simulations
Based on the virtual screening results and analysis, ten compounds were subjected to three independent 100 ns MD simulations to analyze the protein-ligand interactions.Compounds ZINC000103526876, ZINC000095913861, ZINC000095912705, ZINC000085530497, ZINC000095912718, ZINC000070451048, ZINC000085530488, ZINC000095912706, ZINC000103580868, and ZINC000103584057 were chosen for further analysis.The unbound menin structure was also subjected to MD simulations.After the MD simulations, the radius of gyration (Rg), root mean square deviation (RMSD), and root mean square fluctuation (RMSF) of the ten compounds and unbound menin were analyzed (Table 2).
After the MD simulations, snapshots were generated in 25 ns intervals (0, 25, 50, 75, and 100 ns) for each system to analyze ligand binding to menin.Throughout the 100 ns simulations, all three MD runs for menin-ligand complexes, excluding 36294, 71777742, and ZINC000070451048, maintained stable binding to the 0RT binding site of menin.In the second MD run of the menin-36294 complex, the ligand exhibited instability at the binding site at 75 and 100 ns.In the case of the menin-71777742 complex (MD run 2), the ligand relocated to a different region at 25 ns and dissociated from the protein at 50 ns.At 75 and 100 ns, 71777742 was observed to bind to menin at a site other than the 0RT binding site.In the second MD run of the menin-ZINC000070451048 complex, the ligand remained tightly bound to menin until the end, where it eventually dissociated from the protein.The average RMSD, Rg, RMSF, and number of hydrogen bonds were calculated using values from all three MD runs, except for simulations in which the ligand dissociated (Table 2).
The Rg plots of the unbound menin, three menin-inhibitor complexes, and the ten meninligand complexes were generated to determine their relative stability (Figure 6 and Figure S5).All systems including unbound menin were found to remain stable throughout the course of the 100 ns MD simulations.The Rg values for the unbound menin were shown to be the most stable throughout the 100 ns simulation course for all three independent MD runs with averages of 2.86 ± 0.06 and 2.67 ± 0.03 nm for the complete protein and only structured regions of menin, respectively.Generally, all the menin-ligand complexes except ZINC000095912718 and ZINC000085530488 demonstrated overall Rg average values that were less than those of menin, signifying a more compact fold during binding.The radius of gyration of a protein determines the protein structure's compactness [83].The Rg values of a stable protein will remain similar over time; however, if the protein unfolds, then the values will increase or decrease [83].The disordered regions contributed to higher Rg values for analyses involving the full protein structure.For Rg analyses involving only structured regions of menin, a ~0.2 nm decrease in the full protein's Rg was observed (Table 2).
To determine the stability of the complexes throughout the simulation compared to that of their reference structure, RMSD plots were generated (Figure 7 and Figure S6) [84].The RMSD graphs demonstrate how the protein structures of each of the complexes during simulations differ from the reference protein structure over time [84].For the ten meninligand complexes and the unbound menin structure, the RMSD plots showed similar trends.From 0 to 10 ns, the RMSD values rose for all ten menin-ligand complexes and the unbound protein.The unbound menin structure had averages of 0.54 ± 0.09 and 0.47 ± 0.03 nm when RMSDs were determined for the complete protein and only structured regions, respectively (Table 2).
To identify the menin residues that contribute to protein structure fluctuations, the RMSF plots of the ten menin-ligand complexes and three menin-inhibitor complexes were analyzed.The protein regions that are most flexible were analyzed using the RMSF plots (Figure 8 and Figure S7a,b).The regions in the RMSF plots with lower values represent the amino acids that are possibly involved in binding and catalysis.The thirteen menin-ligand complexes demonstrated a similar degree of fluctuation in the same amino acid residues in all three independent MD runs.Amino acid positions 260-375 demonstrated the least fluctuations across all complexes in the three MD runs.This is not surprising as most of the ligands interacted with several residues in that region, including Leu262, Gln265, Tyr281, Met283, Asn287, Asp290, Leu291, Leu294, Tyr324, Tyr328, Glu364, and Glu368 (Table 1).Regions with relatively high fluctuations were observed at menin residue indexes 55-60, 70-75, 105-110, 130-135, 150-155, 205-225, 255-260, 390-410, 475-540, implying that these regions may not be involved in ligand binding.The missing residues (1, 54-73, 387-398, and 462-547), which mostly appeared to be disordered in the remodeled structure, were observed to be the regions with the highest fluctuations.The average RMSF values of all the residues in each complex were also determined (Table 2).

Binding Energies Involved in Menin-Ligand
Binding-To compute the binding free energies of the ten menin-ligand complexes and the three known inhibitor complexes, the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) was used (Figure 9 and Table 3).The free energy-based simulation that MM/PBSA provides has become increasingly better than other methods including molecular docking to compute ligand-protein binding affinities [65].The binding free energy calculations determine the affinity to the active site of the protein, which is essential for drug discovery [66].Since the energy values were computed for all three independent MD runs, overall averages of the energy values were determined for each complex.The overall average energy values were calculated using values from all three MD runs, except for simulations in which the ligand dissociated (36294, 71777742, and ZINC000070451048).To measure the consistency of or variability in the simulation results, the standard deviations of energy values from the independent MD runs were determined (Figure 9 and Table 3).For all experiments, known inhibitor 36294 demonstrated the highest binding free energy of −55.4 kJ/mol while compound ZINC000095913861 demonstrated the lowest binding free energy of −131.1 kJ/mol followed by ZINC000103580868 with −129.6 kJ/mol (Table 3).ZINC000095912705 demonstrated the least variability in binding free energy with a standard deviation of 1.2 kJ/mol, followed by ZINC000103584057 with 2.5 kJ/mol, while ZINC000095912718 and MI-2-2 had the highest variabilities with standard deviations of 29.1 and 24.3 kJ/mol, respectively across all three runs (Figure 9 and Table 3).
Electrostatic and van der Waal's forces are believed to play a role in binding free energies, as determined from earlier studies [56,85].Of all the complexes, 36294 demonstrated the highest overall average van der Waal's energy of −117.2 kJ/mol and 71777742 had the lowest van der Waal's energy of −169.3 kJ/mol followed by ZINC000095913861, ZINC000103580868, ZINC000085530497, and ZINC000095912706 with values of −168.8, −168, −162.9, and −160.5 kJ/mol, respectively (Table 3).

Per-Residue Energy
Decomposition-To determine the energy contribution of each amino acid residue in the protein structure, the MM/PBSA method was used.This method may aid in the development of inhibitors that target the active site of the protein [86].A threshold of more than +5 kJ/mol or less than −5 kJ/mol was used and residues with this threshold could be considered critical binding residues [87].Per-residue energy decomposition calculations were performed for each of the protein-ligand complexes (Figure 10 and Figure S8a-l).The results from the per-residue energy decomposition of only the first independent MD run are discussed herein.

Future Outlook and Implications
This study is one of the few studies that attempts to virtually screen small molecules against the menin structure.The re-modeled protein structure used in this study provides an avenue for the additional discovery of drug-like compounds and better understanding of menin-ligand interactions and promotes catalytic site analysis.Natural products have been under-explored; therefore, this study seeks to show evidence of effective natural products as anti-menin molecules.This study accompanies current endeavors to target the menin Diagram depicting the methods used in this study to predict compounds with the ability to inhibit the menin protein.Natural compounds from the TCM database, and three known menin inhibitors from PubChem, were virtually screened against the menin protein.Structures of the remodeled menin protein using (a) EasyModeller (MOD2) and (b) I-TASSER (ITAS1).Yellow-colored protein regions represent missing residues that were fixed during remodeling while the green portions were solved using X-ray crystallography.Ramachandran plot of the selected menin model (ITAS1) obtained via PROCHECK.The percentages of residues in the most favored regions (red colored; labelled "A", "B", and "L"), additionally allowed regions (yellow colored; labelled "a", "b", "l", and "p"), generously allowed regions (pale yellow; labelled "~a", "~b", "~l", and "~p"), and disallowed regions (white) are 75.5, 18.3, 3.3, and 2.9%, respectively.Superimposition of the "menin-inhibitor" complex and "menin-redocked ligand" complex for (a) MI-2 and (b) MI-89.Conserved residue interactions are circled in red.The interaction profiles of the crystallographic complexes are transparent while those of the re-docked complexes are rendered clearer and brighter for enhanced visibility.Ligand is colored purple with black dots, red spoked arcs represent hydrophobic bonds, while green dashed lines represent hydrogen bonds.To map residues of the crystallographic structures to the canonical menin structures, add 5 to the residue number of the experimentally determined structures.Radius of gyration plots of only the structured regions of the unbound menin protein and the menin-ligand complexes generated using GROMACS after 100 ns MD simulation run 1.The unbound protein is colored black while the known inhibitors, CID 71777742, CID 36294 and 0RT, are colored red, green, and blue, respectively.The unbound menin, and menin complexed with CID 71777742, CID 36294, and 0RT are compared with menin-(a) ZINC000070451048, ZINC000103584057, ZINC000085530488, ZINC000085530497, and ZINC000095912705 complexes; and (b) ZINC000095912706, ZINC000095912718, ZINC000095913861, ZINC000103580868, and ZINC000103526876 complexes.Root mean square deviation of only the structured regions of the unbound menin protein and the menin-ligand complexes generated using GROMACS after 100 ns MD simulation run 1.The unbound protein is colored black while the known inhibitors, CID 71777742, CID 36294, and 0RT, are colored red, green and blue, respectively.The unbound menin, and menin complexed with CID 71777742, CID 36294, and 0RT are compared with menin-(a) ZINC000070451048, ZINC000103584057, ZINC000085530488, ZINC000085530497, and ZINC000095912705 complexes; and (b) ZINC000095912706, ZINC000095912718, ZINC000095913861, ZINC000103580868, and ZINC000103526876 complexes.Root mean square fluctuation plot for MD run 1 of the menin-ligand complexes generated using GROMACS after 100 ns simulation.The known inhibitors, CID 71777742, CID 36294 and 0RT, are colored purple, red, and green, respectively.Bar plots representing the overall averages of the binding free energies from the 3 independent MD runs computed using the MM/PBSA method.The error bars are the standard deviations calculated from the three binding free energy values of each meninligand complex.

Table 1 .
Comparative binding energies and profiles of menin residue interactions with the known inhibitors and the potential lead compounds; docking scores against the representative structure from the largest 20 ns MD cluster.

Table 2 .
Average RMSD, Rg, RMSF, and number of H-Bond values for the unbound menin and menin-ligand complexes after 3 independent 100 ns MD runs.Values are presented as "Value ± Standard deviation" (nm).from the average calculation due to an unstable MD run (or ligand dissociation).

Table 3 .
Energetic insights: MM/PBSA calculations of menin-ligand complexes.The energy values are presented in kJ/mol as "Energy value ± Standard deviation".
Computation (Basel).Author manuscript; available in PMC 2024 June 27.*denotesvalues excluded from the average calculation due to an unstable MD run (or ligand dissociation).Computation (Basel).Author manuscript; available in PMC 2024 June 27.