In Silico Targeting of Fascin Protein for Cancer Therapy: Benchmarking, Virtual Screening and Molecular Dynamics Approaches

Fascin is an actin-bundling protein overexpressed in various invasive metastatic carcinomas through promoting cell migration and invasion. Therefore, blocking Fascin binding sites is considered a vital target for antimetastatic drugs. This inspired us to find new Fascin binding site blockers. First, we built an active compound set by collecting reported small molecules binding to Fascin’s binding site 2. Consequently, a high-quality decoys set was generated employing DEKOIS 2.0 protocol to be applied in conducting the benchmarking analysis against the selected Fascin structures. Four docking programs, MOE, AutoDock Vina, VinaXB, and PLANTS were evaluated in the benchmarking study. All tools indicated better-than-random performance reflected by their pROC-AUC values against the Fascin crystal structure (PDB: ID 6I18). Interestingly, PLANTS exhibited the best screening performance and recognized potent actives at early enrichment. Accordingly, PLANTS was utilized in the prospective virtual screening effort for repurposing FDA-approved drugs (DrugBank database) and natural products (NANPDB). Further assessment via molecular dynamics simulations for 100 ns endorsed Remdesivir (DrugBank) and NANPDB3 (NANPDB) as potential binders to Fascin binding site 2. In conclusion, this study delivers a model for implementing a customized DEKOIS 2.0 benchmark set to enhance the VS success rate against new potential targets for cancer therapies.


Introduction
Tumor metastasis is among the main reasons for mortality, accounting for almost 90% of cancer-related deaths [1]. The vital features of metastatic cancer cells are cell invasion and cell migration, involving the reconstruction of the actin cytoskeleton by triggering the formation of protrusive tissue that leads to enhanced motility of tumor cells in various transformed cells, such as lamellipodia, filopodia and invadopodia [2][3][4].
Fascin1 (termed Fascin thereafter) is an actin-bundling protein (F-actin) playing a main role in formation of protrusions of the cell surface by crosslinking actin filaments tightly and making parallel bundles that promote cell migration [5,6]. Therefore, it is Indazole /Cluster 1 MD efforts. Subsequently, we performed a benchmarking analysis for four diverse-in-architecture and popular docking tools, namely: MOE v. 20.19.01, AutoDock Vina v.1.1.2, VinaXB, and PLANTS v.1.2, representing free and commercial packages, to propose a tool with high predictive power against Fascin. Accordingly, the best performing tool was employed for the prospective VS campaign against the Fascin protein utilizing a repurposing strategy of FDA-approved drugs and North African Natural Products. The proposed hits were further validated in silico via MD simulations and respective analyses.

Selection of Fascin Actives for Decoys Generation
As an initial step, all available Fascin inhibitors were collected from the literature, then the bioactive molecules were manually curated. Due to the novelty of the protein target, many compounds were reported to be in a micromolar range of activity. Therefore, we decided to include them all, and only exclude the one with no determined affinity/activity having Kd values > 100 (µ M). Our focus is on those molecules that bind to actinbinding site 2. This ended up with collecting 25 bioactive molecules as shown in Table 1. This low count of known inhibitors for Fascin reflects the needed efforts to enrich the chemical space of its inhibitory activity. Therefore, in the current study, a VS workflow is proposed to enhance the success rate of drug discovery against Fascin. The core scaffolds of the curated active set represent different chemotype classes, namely: Indazole, N-phenylacetamide, pyrazolo [3,4-d]pyrimidinone, isoquinolone, naphthyridone, pyrazolo [4,3c]pyridine, and pyridone [23,24,27]. A summary of relevant data for the active set is listed in Table 1. Indazole /Cluster 1 2-methyl-N-(1-(4-(trifluoromethyl)benzyl)-1Hindazol-3-yl)furan-3-carboxamide 0.2 nd [23] 4-methyl-N-(1-(4-(trifluoromethyl)benzyl)-1Hindazol-3-yl)isoxazole-5-carboxamide 0.19 nd [24] N-Phenylacetamide /Cluster 2 N-(2,4-dichlorophenyl)-N-methylacetamide nd 92 [27] Pyrazolo [3,4- [23] MD efforts. Subsequently, we performed a benchmarking analysis for four diverse-in-architecture and popular docking tools, namely: MOE v. 20.19.01, AutoDock Vina v.1.1.2, VinaXB, and PLANTS v.1.2, representing free and commercial packages, to propose a tool with high predictive power against Fascin. Accordingly, the best performing tool was employed for the prospective VS campaign against the Fascin protein utilizing a repurposing strategy of FDA-approved drugs and North African Natural Products. The proposed hits were further validated in silico via MD simulations and respective analyses.

Selection of Fascin Actives for Decoys Generation
As an initial step, all available Fascin inhibitors were collected from the literature, then the bioactive molecules were manually curated. Due to the novelty of the protein target, many compounds were reported to be in a micromolar range of activity. Therefore, we decided to include them all, and only exclude the one with no determined affinity/activity having Kd values > 100 (µ M). Our focus is on those molecules that bind to actinbinding site 2. This ended up with collecting 25 bioactive molecules as shown in Table 1. This low count of known inhibitors for Fascin reflects the needed efforts to enrich the chemical space of its inhibitory activity. Therefore, in the current study, a VS workflow is proposed to enhance the success rate of drug discovery against Fascin. The core scaffolds of the curated active set represent different chemotype classes, namely: Indazole, N-phenylacetamide, pyrazolo [3,4-d]pyrimidinone, isoquinolone, naphthyridone, pyrazolo [4,3c]pyridine, and pyridone [23,24,27]. A summary of relevant data for the active set is listed in Table 1. Indazole /Cluster 1 2-methyl-N-(1-(4-(trifluoromethyl)benzyl)-1Hindazol-3-yl)furan-3-carboxamide 0.2 nd [23] 4-methyl-N-(1-(4-(trifluoromethyl)benzyl)-1Hindazol-3-yl)isoxazole-5-carboxamide 0.19 nd [24] N-Phenylacetamide /Cluster 2 N-(2,4-dichlorophenyl)-N-methylacetamide nd 92 [27] Pyrazolo [3,4- MD efforts. Subsequently, we performed a benchmarking analysis for four diverse-in-architecture and popular docking tools, namely: MOE v. 20.19.01, AutoDock Vina v.1.1.2, VinaXB, and PLANTS v.1.2, representing free and commercial packages, to propose a tool with high predictive power against Fascin. Accordingly, the best performing tool was employed for the prospective VS campaign against the Fascin protein utilizing a repurposing strategy of FDA-approved drugs and North African Natural Products. The proposed hits were further validated in silico via MD simulations and respective analyses.

Selection of Fascin Actives for Decoys Generation
As an initial step, all available Fascin inhibitors were collected from the literature, then the bioactive molecules were manually curated. Due to the novelty of the protein target, many compounds were reported to be in a micromolar range of activity. Therefore, we decided to include them all, and only exclude the one with no determined affinity/activity having Kd values > 100 (µ M). Our focus is on those molecules that bind to actinbinding site 2. This ended up with collecting 25 bioactive molecules as shown in Table 1. This low count of known inhibitors for Fascin reflects the needed efforts to enrich the chemical space of its inhibitory activity. Therefore, in the current study, a VS workflow is proposed to enhance the success rate of drug discovery against Fascin. The core scaffolds of the curated active set represent different chemotype classes, namely: Indazole, N-phenylacetamide, pyrazolo [3,4-d]pyrimidinone, isoquinolone, naphthyridone, pyrazolo [4,3c]pyridine, and pyridone [23,24,27]. A summary of relevant data for the active set is listed in Table 1.

Selection of Fascin Actives for Decoys Generation
As an initial step, all available Fascin inhibitors were collected from the literature, then the bioactive molecules were manually curated. Due to the novelty of the protein target, many compounds were reported to be in a micromolar range of activity. Therefore, we decided to include them all, and only exclude the one with no determined affinity/activity having Kd values > 100 (µ M). Our focus is on those molecules that bind to actinbinding site 2. This ended up with collecting 25 bioactive molecules as shown in Table 1. This low count of known inhibitors for Fascin reflects the needed efforts to enrich the chemical space of its inhibitory activity. Therefore, in the current study, a VS workflow is proposed to enhance the success rate of drug discovery against Fascin. The core scaffolds of the curated active set represent different chemotype classes, namely: Indazole, N-phenylacetamide, pyrazolo [3,4-d]pyrimidinone, isoquinolone, naphthyridone, pyrazolo [4,3c]pyridine, and pyridone [23,24,27]. A summary of relevant data for the active set is listed in Table 1.

Selection of Representative PDB Structure(s) for Fascin
The structure of human Fascin includes 493-amino acids. Its four consecutive β-trefoil domains are formed by the amino acid residues 8-139, 140-260, 261-381, and 382-493, respectively [18,39]. The actin-binding site 1, formed by residues from the N and C termini, is located between β-trefoil 1 and 4. Ser39 amino acid, a highly conserved residue in this area, can be phosphorylated by protein kinase C (PKC). The actin-binding site 2 in-

Selection of Representative PDB Structure(s) for Fascin
The structure of human Fascin includes 493-amino acids. Its four consecutive β-trefoil domains are formed by the amino acid residues 8-139, 140-260, 261-381, and 382-493, respectively [18,39]. The actin-binding site 1, formed by residues from the N and C termini, is located between β-trefoil 1 and 4. Ser39 amino acid, a highly conserved residue in this area, can be phosphorylated by protein kinase C (PKC). The actin-binding site 2 includes the residues of β-trefoil 1 and 2 lying at the cleft created by them, while actin-binding site 3 is a potential site where β-trefoil 3 is located [18], as shown in Figure 1.
For the selection of a protein structure to be used in the benchmarking study, we searched all Fascin1 structures and downloaded them from the Protein Data Bank (Table S1 in Supplementary Materials). Previous studies revealed that Fascin inhibitors cause a conformational change in Fascin through an induced-fit inhibitory mechanism disrupting the actin-binding sites, and hence impairing its actin-bundling role [24]. Therefore, to account for the protein's different conformations, their structures were superposed showing their differences in the backbones as indicated by their RMSD values (see Figure S1 in Supplementary Materials). We focused especially on the Fascin structures co-crystallized with ligands in their binding site to detect any structural changes occurring during the ligand-protein binding. Consequently, based on analysis, both Fascin structures (PDB ID: 6I18) and (PDB ID: 6I0Z) were selected to represent two main liganded clusters of conformations. This is indicated by their pairwise RMSD values considering the whole protein structure and their pockets, as demonstrated in Figure S2. For the selection of a protein structure to be used in the benchmarking study, we searched all Fascin1 structures and downloaded them from the Protein Data Bank (Table S1 in SM). Previous studies revealed that Fascin inhibitors cause a conformational change in Fascin through an induced-fit inhibitory mechanism disrupting the actin-binding sites, and hence impairing its actin-bundling role [24]. Therefore, to account for the protein's We aim in this study to tackle the virtual screening performance of examples from diverse docking tools, whether publicly available (e.g., AutoDock Vina, VinaXB, and PLANTS), or commercial ones (e.g., MOE). These docking tools represent different architectures in the development of their optimization/search algorithms and scoring functions. For instance, AutoDock Vina is based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method for the local optimization and uses its own Vina scoring function [40], while PLANTS employs the Protein-Ligand ANT System algorithm and PLANTS CHEMPLP scorning function [41]. Since the majority of the bioactive compounds for the Fascin protein are halogenated compounds (Table 1), this directed us to utilize a docking tool with a halogen-bonding implementation in the scoring function, such as VinaXB [42], where it was developed based on AutoDock Vina. Moreover, applying a commercial package, we used MOE which employs the London dG scoring function. Nonetheless, the benchmarking analysis can be extended to any other docking tool. Initially, these docking tools were utilized for pose-retrieval docking experiments for the co-crystal ligand (PDB: 6I18). Interestingly, they reflected acceptable results with RMSD values < 2 Å, as shown in Figure 1B.

Benchmarking
To provide valid benchmarking sets for the effective performance of the structurebased VS, prerequisites should be met. First, a set of chosen and well-described ligands known as actives should be assembled. Second, the selection of decoys structures should depend on well-established criteria (e.g., DEKOIS 2.0 protocol [37,43,44]). Finally, to represent the ligand binding site well, a respective 3D structure is required. The eligible targets to produce benchmark sets are constrained by these fundamental requirements.
The main aim was to identify the best docking program that can distinguish between the active ligands and the created decoys efficiently. Accordingly, the effectiveness of the corresponding screening rises with an increasing active number recognized in the list of best-scored compounds.
Consequently, we created a challenging set of decoys by using the DEKOIS 2.0 protocol based on the collected bioactives of Fascin from the literature. Then, the benchmarking study was conducted using four popular docking tools, namely, MOE, AutoDock Vina, VinaXB, and PLANTS for evaluating their screening performance against the Fascin structure.
The chemotype enrichment was analyzed with a "p-ROC-chemotype" [45,46] plot ( Figure 3) for the benchmarking of the Fascin structure (PDB ID: 6I18) using a PLANTS docking program. We obtained different chemotype classes (7 clusters) based on their scaffolds. Generally, such clusters' numbers reflect the lack of known small-molecule ligands that emphasizes the need of discovering more various small molecules working as Fascin inhibitors. The bioactivity data of the actives are represented by level of activity (LOA) values ranging from 10 5 to 10 −8 M and recorded as IC 50 or Kd as a type of data (TOD) ( Figure 3A).
The pROC-Chemotype plot revealed that PLANTS can detect potent binder ligands at early enrichment, as shown in ( Figure 3A). For example, the two best-ranked molecules (with docking ranks 1 and 2) have bioactivity ranks 4 and 10, and Kd values of 250 and 1200 nM, respectively, Figure 3A. Both molecules exhibited interactions with the following residues; Phe216, Trp101, and Ala59 for compound rank 1, in addition to interactions with Phe216 and Phe14 for compound rank 2 ( Figure S3) reproducing the key interactions of the reported ligand of Fascin (PDB ID: 6I18) as shown in (Figure 3B,C). It is noteworthy that this co-crystal ligand (BDP-13176) is involved in the active set with a docking rank of 12, and bioactivity rank of 1.
that the four docking tools exhibited significantly better performance for Fascin (PDB ID: 6I18), compared to Fascin (PDB ID: 6I0Z). Furthermore, all assessed tools showed betterthan-random performance against Fascin (PDB ID: 6I18), as shown in Figure 2A  Moreover, only active molecules were recognized at 1% of the score-ordered molecules list, and none of the decoys were enriched, yielding an Enrichment Factor (EF 1%) of 31 for PLANTS compared to 23.31, 3.88, and 3.88 for MOE, Vina, and VinaXB, respectively. This indicates the potential predictive capability of PLANTS to identify active compounds 31 times at early enrichment (database cutoff 1%) more frequently than random performance. Figure 3D displays the docking fitness distribution (fitness = docking score multiplied by −1) of the active compounds. The range of docking score started from −117.6 (best score) to −66.42 (worst score), reported as fitness values of 117.6 to 66.42. In addition, the compounds of cluster 5 are in the superior region of fitness values (e.g., fitness < 106).
Unlike the Fascin (PDB ID: 6I18), the screening performance of Fascin (PDB ID: 6I0Z) using PLANTS docking did not enrich any active compound at 1% of the database as shown in Figure 2B and its pROC-Chemotype plot ( Figure S4). This observation is consistent with other assessed docking tools emphasizing the target-dependent nature of the benchmarking process. These results highlight that the binding site conformation of PDB: 6I18 is welladapted for recognizing small molecule inhibitors and well-suited for virtual screening efforts. Therefore, the results encouraged us to apply PLANTS in the prospective VS against the Fascin structure (PDB ID: 6I18).  Figure 3D displays the docking fitness distribution (fitness = docking score multiplied by −1) of the active compounds. The range of docking score started from −117.6 (best

Prospective Virtual Screening
According to the promising benchmarking analysis results, we employed PLANTS in the virtual screening process of the FDA-approved drugs from the DrugBank database (1469 compounds) [47] besides the natural products from the Northern African Natural Products Database (3912 compounds) [48] against the Fascin structure (PDB ID: 6I18). The VS results of the best enriched 1% of the FDA-approved drugs and NANPDB are displayed in Table 2 and (Table S2), respectively. Regarding the molecules' binding poses, they all showed comparable orientations and interactions with the key amino acids of the binding site 2 to the co-crystal ligand of PDB ID: 6I18, as seen in Figure 1. We chose to elucidate the binding interactions of the best ranked molecules which exhibited better localization of the Fascin binding site 2. Visualizing the DrugBank results, they showed that Remdesivir, Lapatinib, and Fexofenadine, respectively appeared to be the top-scored compounds meeting the previous criteria. Figure 4A displays the docking pose of Remdesivir in the Fascin pocket (PDB ID: 6I18). Remdesivir is a nucleoside analog inhibiting RNA-dependent RNA polymerase. It is used to treat viral infections such as severe acute respiratory syndrome coronavirus 2 [49]. Its proposed binding pose in the Fascin pocket displayed H-bonding interactions via its dihydroxy groups with the side chains of Leu214 and Phe216, Figure 4B. We used the chemical structure of Remdesivir without further bioactivation.
Lapatinib is a tyrosine kinase inhibitor working as an antineoplastic agent that is used for treating patients with aggressive or metastatic HER-negative breast cancer who treated with previous chemotherapies [50]. Its postulated docking pose ( Figure S5) made H-bond interactions through its methylsulfonyl group with Arg217 and Val248 side chains as well as H-pi interaction via its 3-flurophenyl group with Phe14.
Fexofenadine is a second-generation antihistamine that is considered a selective H1receptor antagonist indicated for chronic idiopathic urticarial and allergic rhinitis treat-ment [51]. The binding pose of Fexofenadine exhibited H-bond interactions with Leu214 and Phe216 side chains, besides the H-pi interaction with Gln50 ( Figure S6).
Visualizing the DrugBank results, they showed that Remdesivir, Lapatinib, and Fexofenadine, respectively appeared to be the top-scored compounds meeting the previous criteria. Figure 4A displays the docking pose of Remdesivir in the Fascin pocket (PDB ID: 6I18). Remdesivir is a nucleoside analog inhibiting RNA-dependent RNA polymerase. It is used to treat viral infections such as severe acute respiratory syndrome coronavirus 2 [49]. Its proposed binding pose in the Fascin pocket displayed H-bonding interactions via its dihydroxy groups with the side chains of Leu214 and Phe216, Figure 4B. We used the chemical structure of Remdesivir without further bioactivation. Lapatinib is a tyrosine kinase inhibitor working as an antineoplastic agent that is used for treating patients with aggressive or metastatic HER-negative breast cancer who treated with previous chemotherapies [50]. Its postulated docking pose ( Figure S5) made H-bond interactions through its methylsulfonyl group with Arg217 and Val248 side chains as well as H-pi interaction via its 3-flurophenyl group with Phe14.
Fexofenadine is a second-generation antihistamine that is considered a selective H1receptor antagonist indicated for chronic idiopathic urticarial and allergic rhinitis treatment [51]. The binding pose of Fexofenadine exhibited H-bond interactions with Leu214 and Phe216 side chains, besides the H-pi interaction with Gln50 ( Figure S6).
Regarding the VS of the natural products from NANPDB, compounds CP3451, CP3270, and CP3685 (see Table S2), termed NANPDB1-3 thereafter, displayed the best docking pose and ligands' interactions occupying the binding site 2 of Fascin protein, effectively. Compound NANPDB1 is 2S,3R-4E,8E-2-(octadecanoylamino)-octadeca-4,8diene-1,3-diol. It is ceramide that was extracted and isolated from the Egyptian Red Sea soft coral Heteroxenia ghardaqensis. The extracted compound exhibited a moderate anticancer effect on human Hep-G2 cancer cell lines working as a growth inhibitor [52]. NANPDB2 is 1-O-linoleoyl-3-O-beta-D-galactopyranosyl-syn-glycerol isolated from the aerial parts extract of the Egyptian plant Sida spinosa L., Malvaceae. The plant was reported to be used in treating nervous, urinary, and cardiac diseases [53]. NANPDB3 is Quercetin-3-O-beta-(6''-galloylgalactoside) which was isolated from the Egyptian Sanguisorba minor plant. The plant's extract is used in folk medicine for its hypoglycaemic activity [54]. These natural products are expected to be used as adjuvant therapies or as Regarding the VS of the natural products from NANPDB, compounds CP3451, CP3270, and CP3685 (see Table S2), termed NANPDB1-3 thereafter, displayed the best docking pose and ligands' interactions occupying the binding site 2 of Fascin protein, effectively. Compound NANPDB1 is 2S,3R-4E,8E-2-(octadecanoylamino)-octadeca-4,8-diene-1,3-diol. It is ceramide that was extracted and isolated from the Egyptian Red Sea soft coral Heteroxenia ghardaqensis. The extracted compound exhibited a moderate anti-cancer effect on human Hep-G2 cancer cell lines working as a growth inhibitor [52]. NANPDB2 is 1-O-linoleoyl-3-Obeta-D-galactopyranosyl-syn-glycerol isolated from the aerial parts extract of the Egyptian plant Sida spinosa L., Malvaceae. The plant was reported to be used in treating nervous, urinary, and cardiac diseases [53]. NANPDB3 is Quercetin-3-O-beta-(6"-galloylgalactoside) which was isolated from the Egyptian Sanguisorba minor plant. The plant's extract is used in folk medicine for its hypoglycaemic activity [54]. These natural products are expected to be used as adjuvant therapies or as supplements with cancer therapies. The docking poses and ligands' interactions of the best-scored compounds CP3451 (NANPDB1), CP3270 (NANPDB2), CP3756, CP3407, and CP3831 are displayed in Figures S7-S11, respectively. NANPDB1 exhibited H-bond interactions with Ile93 while NANPDB2 displayed H-bond interactions with Glu215, Arg217, and Val248. The proposed binding pose of NANPDB3 made H-bond interactions with Glu215, Ala58, Ile93, and Leu214, as shown in Figure 5.

Molecular Dynamics Simulation
The three top-enriched ligands in DrugBank (Remdesivir, Lapatinib, and Fexofenadine) and compounds (NANPDB1-3) from the natural products database were subjected to 100 ns MD simulations to evaluate their stability inside the Fascin binding site. Seven MD runs were conducted, including two extra runs for the holoprotein and unliganded protein as a reference to account for the dynamics of the protein and its co-crystallized ligand. Figure 6 displays the analysis of the protein's radius of gyration (RoG), root mean square deviation (RMSD), and root mean square fluctuation (RMSF). Radius of gyration is a measurement of how compact the protein structure was throughout the simulation period. The RoG fluctuation of the protein complexes is within 6 Å, with Remdesivir and NANPDB2 having the lowest and the highest RoG values, respectively, at the end of the 100 ns simulation as shown in Figure 6. For instance, the reference holoprotein exhibited minor fluctuations until 75 ns, then a deviation can be observed afterward from 75 ns to the end of the simulation within 56 Å to 58 Å. While the unliganded system displayed a smooth increase of RoG from 54 Å to 56 Å with no visible dramatic fluctuations on its path. Unlike the behavior of RoG of the reference simulations, NANPDB1 (purple) and NANPDB2 (blue) showed obvious high fluctuations after 30 ns and 80 ns of simulation time, respectively, compared to other ligand-complex systems. Nonetheless, the overall RoG behavior of all complexes indicates successful protein simulation during the simulation course and the absence of major conformational changes or unfolding processes during the simulation.

Molecular Dynamics Simulation
The three top-enriched ligands in DrugBank (Remdesivir, Lapatinib, and Fexofenadine) and compounds (NANPDB1-3) from the natural products database were subjected to 100 ns MD simulations to evaluate their stability inside the Fascin binding site. Seven MD runs were conducted, including two extra runs for the holoprotein and unliganded protein as a reference to account for the dynamics of the protein and its co-crystallized ligand. Figure 6 displays the analysis of the protein's radius of gyration (RoG), root mean square deviation (RMSD), and root mean square fluctuation (RMSF). Radius of gyration is a measurement of how compact the protein structure was throughout the simulation period. The RoG fluctuation of the protein complexes is within 6 Å , with Remdesivir and NANPDB2 having the lowest and the highest RoG values, respectively, at the end of the 100 ns simulation as shown in Figure 6. For instance, the reference holoprotein exhibited minor fluctuations until 75 ns, then a deviation can be observed afterward from 75 ns to the end of the simulation within 56 Å to 58 Å . While the unliganded system displayed a smooth increase of RoG from 54 Å to 56 Å with no visible dramatic fluctuations on its path. Unlike the behavior of RoG of the reference simulations, NANPDB1 (purple) and NANPDB2 (blue) showed obvious high fluctuations after 30 ns and 80 ns of simulation time, respectively, compared to other ligand-complex systems. Nonetheless, the overall RoG behavior of all complexes indicates successful protein simulation during the simulation course and the absence of major conformational changes or unfolding processes during the simulation.  2D as (A,B), respectively. The color scheme is the same as in Figure 4. Moreover, the protein dynamics' stability was assessed via the RMSD, which was calculated on alpha carbon atoms. For the unliganded protein, low change for the RMSD values can be observed until 50 ns of simulation; afterward, a higher change took place from 1.5 Å to 3 Å, while the holoprotein complex displayed a lower change throughout the simulation time with low fluctuation after 85 ns. This suggests that the co-crystal ligand can better stabilize the protein and produce lower fluctuations for the backbone protein atoms. Interestingly, Remdesivir (red), NANPDB1 (purple), NANPDB2 (blue), and NANPDB3 (cyan) displayed lower changes in the RMSD values compared to Lapatinib (orange) and Fexofenadine (yellow). For instance, the Remdesivir complex system displayed RMSD values around 1.5 Å from 0 to 65 ns, then rising to 2.5 Å at 70 ns to return around 1.5 Å from 75 to 95 ns, with another cycle of rising to 2.5 Å at 95 ns and coming back to 1.5 Å at 100 ns. Unlike the low RMSD value fluctuations of Remdesivir, Fexofenadine showed increased values from 2 Å to 4 Å from 0 to 55 ns simulations, while rapid increase and fluctuations could be seen around 2 Å from 55 to 100 ns. Initially, these observations indicate the superior ability of Remdesivir from the FDA-approved drugs to stabilize protein backbone compared to Lapatinib and Fexofenadine, and likewise, for the three natural products.
RMSF measures the per residue conformational changes throughout the simulation time. We observed common flexibility patterns in all systems in some regions, mainly in residue numbers 30-50 with RMSF values (>3 Å) and 130-150 (>2 Å). This observed flexibility is attributable to the structural loop regions for the Fascin protein. The flexibility of these regions was significantly reduced in systems with Remdesivir (red), NANPDB2 (blue), and NANPDB3 (cyan), especially compared to the unliganded system (green) and the co-crystal system (black). For instance, the region of residue number 30-50 exhibited >5 Å fluctuations for both the unliganded and co-crystal systems, while the presence of Remdesivir (red), NANPDB2 (blue), and NANPDB3 (cyan) as ligands reduced such fluctuations dramatically (<4 Å). On the other hand, other systems, such as Fexofenadine (yellow) displayed high flexibility in the majority of the regions. Remarkably, complex systems with Remdesivir (red), NANPDB2 (blue), and NANPDB3 (cyan) exhibited low RMSF values (<2 Å) at the binding site amino acids revealing the minimal conformational changes for these residues and reflecting a promising stabilization effect of these ligands to the binding site. Moreover, the protein dynamics' stability was assessed via the RMSD, which was calculated on alpha carbon atoms. For the unliganded protein, low change for the RMSD values can be observed until 50 ns of simulation; afterward, a higher change took place from 1.5 Å to 3 Å , while the holoprotein complex displayed a lower change throughout the simulation time with low fluctuation after 85 ns. This suggests that the co-crystal ligand can better stabilize the protein and produce lower fluctuations for the backbone protein atoms. Interestingly, Remdesivir (red), NANPDB1 (purple), NANPDB2 (blue), and NANPDB3 (cyan) displayed lower changes in the RMSD values compared to Lapatinib (orange) and Fexofenadine (yellow). For instance, the Remdesivir complex system displayed RMSD values around 1.5 Å from 0 to 65 ns, then rising to 2.5 Å at 70 ns to return around 1.5 Å from 75 to 95 ns, with another cycle of rising to 2.5 Å at 95 ns and coming back to 1.5 Å at 100 ns. Unlike the low RMSD value fluctuations of Remdesivir, Fexofenadine showed increased values from 2 Å to 4 Å from 0 to 55 ns simulations, while rapid increase and fluctuations could be seen around 2 Å from 55 to 100 ns. Initially, these observations indicate the superior ability of Remdesivir from the FDA-approved drugs to  Figure 7 shows the analysis of RMSD for the heavy atoms of the ligand poses and their hydrogen bond count with the protein. As a reference, the co-crystal ligand showed stable RMSD behavior around 3 Å throughout the whole simulation reflecting good stability of the co-crystal pose. According to RMSD measurements for the tested poses, Remdesivir (red) displayed the lowest RMSD values among the other poses of the FDA-approved drugs revealing its highest stability within the Fascin binding site. Remdesivir showed minor fluctuations from 3 Å to 5 Å at 0 ns to 5 ns, then a steady behavior around 5 Å ± 1 Å from 5 ns to 100 ns. On the other hand, Fexofenadine (yellow) exhibited a balanced behavior around 3 Å from the start of the simulation to 50 ns, then a rapid increase in RMSD values to 12-18 Å after 50 ns until the end of the simulation, reflecting high changes in the pose coordinates in the binding site. From the natural product ligands, NANPDB3 (cyan) showed the best stability compared to NANPDB1 (purple) and NANPDB2 (blue). The NANPDB3 pose exhibited an increase in RMSD values from 3 Å to 6 Å during the 0 ns to 20 ns, while it showed a constant performance around the RMSD value of 6 Å afterward throughout the simulation from 20 ns to 100 ns. Unlike NANPDB3, NANPDB2 displayed high RMSD fluctuations, especially after 20 ns of the simulation time where the RMSD values increased towards 14 Å at 39-43 ns with some greater fluctuations until the end of the 100 ns simulation.
According to the generated hydrogen bonds number between each ligand and its relevant protein (Figure 7), NANPDB3 (cyan) exhibited the highest number of hydrogen bonds revealing its potent binding compared to other molecules, followed by NANPDB2 (blue) and Fexofenadine (yellow) showing their moderately strong affinity. Regarding Lapatinib and NANPDB1, they exhibited the least number of hydrogen bonds formed with their proteins indicating their weak affinity. Although Fexofenadine showed a high number of hydrogen bonds at the end of the simulation, we propose that its pose underwent a major positional change inside the binding site provided by the fact of its RMSD plot (Figure 7). According to the generated hydrogen bonds number between each ligand and its relevant protein (Figure 7), NANPDB3 (cyan) exhibited the highest number of hydrogen bonds revealing its potent binding compared to other molecules, followed by NANPDB2 (blue) and Fexofenadine (yellow) showing their moderately strong affinity. Regarding Lapatinib and NANPDB1, they exhibited the least number of hydrogen bonds formed with their proteins indicating their weak affinity. Although Fexofenadine showed a high number of hydrogen bonds at the end of the simulation, we propose that its pose underwent a major positional change inside the binding site provided by the fact of its RMSD plot (Figure 7).
To further elucidate the relative positioning of the proposed ligands to the key binding site residues, we monitored their relative distances (based on the center of mass) to such residues of the Fascin binding site during the 100 ns simulation course. The selection of residues to be considered in this assessment was based on their role in the ligand's binding to the Fascin binding site 2. The reported Fascin crystal structures revealed that the ligands bind in an induced pocket with a hydrophobic "hook" in the cleft between the β-trefoil domains 1 and 2 and extend towards the protein surface, causing a significant To further elucidate the relative positioning of the proposed ligands to the key binding site residues, we monitored their relative distances (based on the center of mass) to such residues of the Fascin binding site during the 100 ns simulation course. The selection of residues to be considered in this assessment was based on their role in the ligand's binding to the Fascin binding site 2. The reported Fascin crystal structures revealed that the ligands bind in an induced pocket with a hydrophobic "hook" in the cleft between the β-trefoil domains 1 and 2 and extend towards the protein surface, causing a significant conformational change in domain 1 and making interactions with specific domain 2 residues [24,27]. Although the inhibition mechanism of the Fascin bundling activity is still unclear, it is noteworthy that proposed Fascin's actin-binding sites 1 and 2 include domain 1 crossing a domain boundary. Moreover, both sites would be deformed by the conformational change induced by the bound ligand, thus disrupting actin-binding [24,27]. Accordingly, Phe14, Ala59, Ile93, and Trp101 residues were selected from domain 1 besides Leu214, Glu215, and Phe216 from domain 2 for the distance monitoring to the proposed ligands. Visualizing DrugBank candidates, Remdesivir exhibited the best behavior compared to Lapatinib and Fexofenadine indicated by its distance behavior to the selected residues, as observed in Figure 8. For instance, Remdesivir displayed a distance range of~(1-1.6), (0.9-1), (0.5-0.6), 0.75, (0.9-1.1), (0.6-0.7), and (0.6-1) nm to the center of mass of Phe14, Ala59, Ile93, Trp101, Leu214, Glu215, and Phe216, respectively. Such behavior with minor distance range fluctuations, especially after 20 ns, best mimicked the behavior of the co-crystallized ligand (Figure 8), reflecting an acceptable stability of Remdesivir during the simulation course. However, the distance values of Lapatinib and Fexofenadine indicated higher fluctuations per residue compared to the co-crystallized ligand, as seen in Figure 8. Like Remdesivir, NANPDB3 displayed the best distance behavior compared to NANPDB1 and NANPDB2, especially after 20 ns. NANPDB3 displayed distance ranges of~(1.25-1. Focusing on Remdesivir and NANPDB3, we utilized the principal component analysis (PCA) to analyze the conformational sampling of the Remdesivir-, NANPDB3-and co-crystallized ligand-Fascin complex systems, as well as the unliganded protein system in the simulated subspace via examining their dominant modes of motion. The covariance Based on the above-mentioned results, both Remdesivir (from DrugBank) and NAN-PDB3 (from NANPDB) showed high stability and were recommended to be the best potential binders to Fascin actin-binding site 2.
Focusing on Remdesivir and NANPDB3, we utilized the principal component analysis (PCA) to analyze the conformational sampling of the Remdesivir-, NANPDB3-and cocrystallized ligand-Fascin complex systems, as well as the unliganded protein system in the simulated subspace via examining their dominant modes of motion. The covariance matrix of atomic fluctuations was diagonalized for predicting the eigen values. The first few eigen vectors play a critical role in the motions of the protein. The first 2 eigen vectors have a higher eigen value for all four Fascin systems suggesting-to a certain extent-their comparable behavior for the whole protein motion. To expose the ligand influences on the conformational heterogeneity of Fascin, associated free energy landscapes (FEL) were determined as a function of the top two principal components (PC1 and PC2), as exemplified in Figure 9. FEL can be employed to effectively explain conformational redistributions prompted by binding events [55][56][57]. The color bar represents the free energy value in kcal mol −1 . The color ranges from red to yellow to blue spots indicate the energy minima and energetically favored protein conformations to more unfavorable high-energy conformations. Figure 9 demonstrates the relative conformational changes of the protein backbone of the four simulated systems. The deeper color (towards the red color) in the plot reveals lower-energy conformational metastable states. Remdesivir-and NANPDB3-Fascin complex systems populated by a wide energy basin suggesting a range of metastable states and ensemble of low energy conformations of the simulated subspace during the 100 ns simulation ( Figure 9A,B). On the other hand, the simulated co-crystal system of Fascin The color bar represents the free energy value in kcal mol −1 . The color ranges from red to yellow to blue spots indicate the energy minima and energetically favored protein conformations to more unfavorable high-energy conformations. Figure 9 demonstrates the relative conformational changes of the protein backbone of the four simulated systems. The deeper color (towards the red color) in the plot reveals lower-energy conformational metastable states. Remdesivir-and NANPDB3-Fascin complex systems populated by a wide energy basin suggesting a range of metastable states and ensemble of low energy conformations of the simulated subspace during the 100 ns simulation ( Figure 9A,B). On the other hand, the simulated co-crystal system of Fascin visits two separate energy basins; one represents the global minimum of the simulated subspace, while the other is quite narrow and separated by some conformations with relatively low energy from the main basin ( Figure 9C). This reflects the presence of one main ensemble of low energy conformations of flexible and low energy conformations during 100 ns simulation. Remarkably, the unliganded protein system ( Figure 9D) clearly displays two distinct energy basins with low incidence for visiting the global minimum of the simulated subspace (few red dots). This reflects that the liganded complex systems ( Figure 9A-C) would drive the simulated subspace into a higher incidence of lower energy ensemble of conformations compared to the unliganded system. Thus, these results clearly highlight that Remdesivir and NANPDB3 binding to Fascin can alter the protein conformational subspace towards low-energy conformations, and therefore, modulate its function.
Overall, the results of the MD simulations endorse the high potential and stable binding of Remdesivir and NANPDB3 to Fascin as an outcome of a benchmarking-guided virtual screening effort.
To provide more insights on both poses of Remdesivir and NANPDB3 throughout the MD trajectory, we computed the minimum distance between the interacting atoms of the ligand and protein residues, as shown in Figure 10. Initially, Remdesivir showed H-bonding interactions with Leu214 and Phe216, as well as hydrophobic interaction with Val134. These favorable interactions are reflected in distances of~0.3 to 0.4 nm (i.e., 3 Å -4 Å) between Remdesivir's O1 and O2 atoms of its sugar-like moiety to the O atoms of the backbone of Leu214 and Phe216 (more details are in Figure 10). This interacting pose appeared to be consistent during the beginning of the simulation time (from 0 to 5 ns), while the distance graph proposes that some dynamics affected the interaction pattern with a new interacting pose at 10 ns and remained consistent throughout the whole 100 ns MD simulation. During this transformation, a new interacting residue (Arg224 -yellow line in Figure 10A) approached to form H-bonding interactions with Remdesivir (distance~0.3 to 0.4 nm from atom O1) stabilizing its new pose from 10 ns to 100 ns of the simulation time. Interestingly, Leu214, Phe216 and Val134 remained at constant distance ranges from 0.5 nm to 0.8 nm. Inspecting the relative position of different poses of Remdesivir, we conclude that there are two main clusters of Remdesivir poses revealed via different time snapshots. The first cluster of poses can be visualized during simulation time of 0-5 ns, while the second cluster comprises most of the poses, during 5 ns to 100 ns of the simulation time (see the snapshots in Figure 10). The main differences between the two cluster of poses are attributable to minor rotation of the heterocyclic system of pyrrolotriazine ring and the bonded sugar-like part to accommodate favorable binding and H-bonding interactions in the binding site. These observations propose a stable binding of Remdesivir at the near proximity of the key Fascin residues and hence potentially modulate its function.
Like Remdesivir, NANPDB3 pose initially displayed H-bonding interactions with Leu214, and additionally with Glu215, Ile93 and Ala58. The key atoms for this H-bonding network are the O atoms for the sugar part, namely: O11, O12 and O13 and the O atoms of the backbone of the mentioned residues and the side chain oxygen (OE2) of Glu215 (see Figure 10B). Such favorable interactions are reflected in distances of~0.3 to 0.4 nm between the respective atoms, as displayed in Figure 10B. The interacting NANPDB3 pose exhibited some fluctuations from 0 to 20 ns time with a distance range to the respective residue atoms from 0.3 to 1.75 nm ( Figure 10B). A new stable pose is predominantly formed from 20 ns to the end of 100 ns simulation time creating new balanced distances to the side chain atoms of NH1 and NH2 for Arg217 and Arg224, respectively, indicating stable H-bonding interactions. Taking different time snapshots of NANPDB3 poses ( Figure 10B) revealed that the new pose (time: 20-100 ns) is evolved due to a major flip of the trihydroxy phenyl group of NANPDB3 from its early poses (time: 50 ps and 400 ps). This flip is tolerable since the trihydroxy phenyl group is mostly solvent-exposed and possess greater degrees of freedom. Overall, like Remdesivir, these observations suggest a stable binding of NANPDB3 after 20 ns at the vicinity of the key Fascin residues and therefore hypothetically able to modulate its function. Generally, these observations of Figure 10 are highly consistent with the observations of Figure 8.

Preparation of Protein Structures
Crystal structures (PDB ID. 6I0Z) and (PDB ID. 6I18) of Fascin, adopted for this study, were isolated from homo sapiens and expressed in Escherichia coli. The structures are co-crystallized in complex with~{N}-(2,4-dichlorophenyl)-~{N}-methyl-ethanamide and 5-[(3,4-dichlorophenyl)methyl]-4-oxidanylidene-1-piperidin-4-yl-~{N}-pyridin-4-ylpyrazolo [4,3-c]pyridine-7-carboxamide, respectively. The X-ray crystal structures were downloaded from the Protein Data Bank where they displayed Resolutions of 1.77 Å and 1.49 Å, respectively. Molecular Operating Environment (MOE) v.2019.01 [58] was utilized to prepare the protein structures before the docking processes, adopting the AMBER10:EHT force field. Any redundant chains with unessential ions, crystallization molecules, and water molecules were discarded. Then, the MOE function "Quickprep" was employed at default settings. These settings include applying the function "Protonate 3D" for improving the H-bonding network and permitting ASN/GLN/HIS to flip throughout protonation. In addition, the refinement of the ligand and binding site atoms was conducted by minimizing the energy to 0.1 kcal/mol/A RMS gradient, and the receptor atoms were restrained by applying a force constant (strength = 10). The remaining atoms of the receptor lying outside the binding pocket were maintained the same. The outcome of the previous settings did not display a significant difference regarding the binding site/ligand coordinates. The prepared protein structures were saved as mol2 files for the docking efforts. The benchmarking experiments were conducted on Fascin (PDB ID: 6I0Z) and Fascin (PDB ID: 6I18), while Fascin (PDB ID: 6I18) was chosen for VS of the DrugBank and NANPDB molecules.
MOE v.2019.01 was used for conducting the protein superpositions.

Preparation of Small Molecules of DEKOIS 2.0 Benchmark Set, DrugBank FDA-Approved Drugs, and NANPDB Molecules
The DEKOIS 2.0 [37] protocol was employed on 25 Fascin bioactives, collected from literature [23,24,27], to create 750 challenging decoys (1:30 ratio). After that, preparation of all molecules was performed by MOE v.2019.01 using 'Molecule wash'. This setting was utilized to produce valid protonation states through protonating strong bases and deprotonating strong acids (if needed). In addition, the minimizing of compounds energy was employed via forcefield "Amber: 10EHT" at a 0.01 RMSD gradient. The remaining parameters were maintained at default settings. One protonation state was made at pH 7.0 and one conformer was retained for each molecule. Moreover, the stereo configuration of all actives, decoys, DrugBank, and NANPDB compounds was kept. The prepared compounds were saved as SD files which were transformed and split into PDBQT files via OpenBabel [59] for AutoDock Vina and VinaXB docking experiments. For docking experiments via PLANTS, the SD files were converted into mol2 files and the types of correct atoms were performed by SPORES software [60,61].

Benchmarking
Concerning docking of the prepared molecules to the active site of Fascin structures using MOE v.2019.01, the molecules were docked in the ligand binding site of the Fascin structure. Triangle matcher was set as the placement, while London dG and GBVI/WSA dG were set as the first and second rescoring functions, respectively and the refinement was via forcefield.
Regarding docking via AutoDock Vina (version 1.1.2) and VinaXB [40,42], the converting of the protein files to PDBQT files was performed by utilizing a python script known as (prepare_receptor4.py) from the MGLTools package (version 1.5.4) [62]. The dimensions of the docking grid box were 18 Å × 18 Å × 18 Å, with a 1 Å grid spacing to ensure that all docked compound geometries were covered. However, the docking method's search efficiency was retained at its default setting.
For PLANT (version 1.2) docking [41], "ChemPLP," was the employed scoring function with selecting "screen" mode. Within 5 Å of the co-crystal ligand coordinates, the binding site was identified.

Virtual Screening of DrugBank FDA-Approved Drugs and NANPDB Molecules
PLANTS was chosen as the docking program for virtual screening due to its significant performance in the benchmarking study. VS was carried out by docking FDA-approved drugs from DrugBank and NANPDB molecules into the prepared Fascin crystal structure in complex with the co-crystallized ligand BDP-13176 (PDB ID: 6I18).

pROC and pROC-Chemotype Calculations
The score-based docking rank was employed in the calculation of pROC-AUC utilizing the KNIME "R-Snippet" component [63] according to the following Equation (1) [64]: where n is the bioactives number while the decoys fraction ordered higher than ith active identified is represented by Di where ith is the bioactive number in the rank, and where ith represents the bioactive number in the rank. The plots of pROC-Chemotype were generated by the tool "pROC-Chemotype plot" which is available at http://www.dekois.com/ (accessed on 15 July 2022) [45,46].
To evaluate the docking program's ability to detect true-positive actives, in the list of the docking rank in comparison to the random collection, the enrichment factor (EF) was calculated according to the subsequent Equation (2) [65]: Bioactives total N total (2) The figures of protein structure were rendered using MOEv.2019.01 and Pymol [66].

Molecular Dynamics Simulations
Molecular dynamics simulations were conducted using GROMACS 2020.3 [67]. The solvation of each protein-ligand complex was carried out in a dodecahedron box of TIP3P explicit water model [68]. Then, the system was neutralized using NaCl ions with ionic strength of 0.1 M concentration. For system energy minimization, the steepest descent minimization algorithm was utilized by a convergence set at 10 kJ/mol and 50,000 steps. At 300 K temperature and 1 atm pressure, each NVT followed by NPT equilibration was conducted for 500 ps. After that, a production run at NPT ensemble was performed for 100 ns. For each equilibration run, temperature coupling was carried out using the V-rescale modified Berendsen thermostat [69], For equilibration and production runs, a 2 ps time constant Berendsen coupling [70] was employed for pressure coupling. Furthermore, for pressure coupling, the Parrinello-Rahman pressure coupling scheme [71] was utilized for the production runs. Using the Verlet cutoff-scheme with 1.2 cutoff and 1.0 nm switch list distances was for Van der Waals calculations and searching for adjacent atoms. The method of Particle Mesh Ewald [72] was employed for the long-range electrostatics calculations within 1.2 nm. The bond lengths were constrained using the LINear Constraint Solver (LINCS) algorithm. [73]. The protein molecules' topology and parameters were generated by applying the CHARMM36 all-atom force field [74], while the ligand parameters were generated using the SwissParam server [75]. A leap-frog integrator with a steps size of 2 fs was utilized for all simulations. ProDy's Python library was used to calculate protein RMSD, RMSF, and radius of gyration [76,77], while VMD's rmsd trajectory analysis tool was used to determine ligand RMSD and hydrogen bonds [78]. GROMACS and Matplotlib python plotting library were employed for constructing all analysis charts [79].

Conclusions
Fascin is overexpressed in various carcinomas that are associated with metastasis and poor prognosis. In this study, we carried out (CADD) approaches to systematically recommend potential inhibitors of the Fascin protein. First, Fascin protein structures (PDB ID: 6I18) and (PDB ID: 6I0Z) were selected to represent the conformations of the target space of Fascin-liganded structures. Then, diverse bioactive molecules were collected from literature having different scaffolds, namely: Indazole, N-phenylacetamide, pyrazolo pyrimidin-4-one, isoquinolone, naphthyridone, pyrazolo [4,3-c]pyridine and pyridone, to compile an active set for benchmarking study. Accordingly, a set of high-quality decoys was generated via DEKOIS 2.0 protocol to be utilized in the benchmarking process against the selected Fascin structures. Four popular docking tools, MOE, AutoDock Vina, VinaXB, and PLANTS were employed in the benchmarking effort. All docking tools exhibited better-than-random performance against one Fascin structure (PDB ID: 6I18). Based on the benchmarking outcomes utilizing the pROC-AUC, and EF 1%, PLANTS exhibited the best screening performance. Visualizing chemotype enrichment of PLANTS via a pROC-Chemotype plot revealed the ability of this docking tool to enrich the potent bioactive molecules in the early enrichment. This outcome encouraged us to employ PLANTS in conducting SBVS against Fascin (PDB ID: 6I18) to repurpose FDA-approved drugs (from DrugBank) and natural products (from NANPDB). The VS results showed that Remdesivir, Lapatinib, and Fexofenadine (from DrugBank) and NANPDB1-3 (from NANPDB) can be endorsed as potential binders of the Fascin structure. Finally, to further validate the compounds' stability, we performed molecular dynamic (MD) simulations for 100 ns. MD recommended that Remdesivir from the DrugBank series and NANPDB3 from the NANPDB series to be the best potential binders to Fascin binding site 2.
Generally, our study provides an example of recruiting a DEKOIS 2.0 benchmark set as a method to elevate the success rate for further virtual screening efforts against new vital targets for anticancer and antimetastatic drug discovery.
In addition, the best-ranked repurposed molecules Remdesivir and NANPDB3 from FDA-approved drugs and natural products databases, respectively, are recommended for further biological investigations against Fascin to provide potential therapeutic agents.  Table S1: Fascin structures in the protein data bank (PDB); Table S2: The best enriched 1% of the VS results for NANPDB molecules against Fascin (PDB ID: 6I18); Video S1: Remdesivir -Fascin complex; Video S2: NANPDB3-Fascin complex. References [80,81]   Data Availability Statement: All data presented in this study are available on request from the corresponding author. The structures of the original data (active set) can be found in the article.