Identification of a Putative SARS-CoV-2 Main Protease Inhibitor through In Silico Screening of Self-Designed Molecular Library

There have been outbreaks of SARS-CoV-2 around the world for over three years, and its variants continue to evolve. This has become a major global health threat. The main protease (Mpro, also called 3CLpro) plays a key role in viral replication and proliferation, making it an attractive drug target. Here, we have identified a novel potential inhibitor of Mpro, by applying the virtual screening of hundreds of nilotinib-structure-like compounds that we designed and synthesized. The screened compounds were assessed using SP docking, XP docking, MM-GBSA analysis, IFD docking, MD simulation, ADME/T prediction, and then an enzymatic assay in vitro. We finally identified the compound V291 as a potential SARS-CoV-2 Mpro inhibitor, with a high docking affinity and enzyme inhibitory activity. Moreover, the docking results indicate that His41 is a favorable amino acid for pi-pi interactions, while Glu166 can participate in salt-bridge formation with the protonated primary or secondary amines in the screened molecules. Thus, the compounds reported here are capable of engaging the key amino acids His41 and Glu166 in ligand-receptor interactions. A pharmacophore analysis further validates this assertion.


Introduction
Since the emergence of the coronavirus disease 2019  in late 2019, it has become a significant public health concern, with confirmed cases reported across the globe, and continuing to rise. As of now, the World Health Organization (WHO) has recorded over 755 million confirmed cases, and a cumulative death toll of approximately 6.8 million [1]. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been identified as the primary pathogen responsible for COVID-19 [2][3][4], which is an RNA virus known to easily mutate into new variants with different characteristics [5][6][7]. Despite the widespread administration of vaccines, the latest variant, Omicron, has been able to evade the effects of the vaccine, and spread globally, resulting in fatalities [8,9]. Therefore, developing inhibitors against SARS-CoV-2 is of paramount importance to combatting the pandemic.
The proteolytic function of the main protease (M pro ), also known as 3-Chymotrypsinlike protease (3CLpro), plays a critical role in the replication of SARS-CoV-2 during the initial stages of viral replication. The coronavirus genome is structured as a 5 -cap and a 3 -PolyA tail, and contains 6-12 open reading frameworks (ORFs). The first ORF (ORF1a/b) makes up roughly two-thirds of the genome's length, and is responsible for producing two polyproteins, pp1a and pp1ab, via the a-1 frameshift between ORF1a and ORF1b. These polyproteins are then cleaved by the main proteases in host cells into 16 nonstructural proteins (NSPs), to generate functionally active viral replication complexes [10,11]. Therefore, Nilotinib is a second-generation inhibitor of the Bcr-Abl protein, a key kinase in chronic myeloid leukemia. In our previous research, our main focus was on the design and synthesis of inhibitors specifically targeting leukemia. Many compounds were derived from existing tyrosine kinase inhibitors, primarily nilotinib, incorporating molecular scaffolds, structural modifications, and optimization strategies [21,22]. Nilotinib primarily comprises an aromatic heterocyclic moiety containing a pyrimidine core, an aniline group substituted with trifluoromethyl and imidazole, and a linker that connects these two components ( Figure 1B). Since the compounds we developed have similar structures to nilotinib, we speculate that they may also have the potential to inhibit M pro . We have built a molecular library of our compounds, and the structures of all the compounds are given in Supplementary Materials (Table S1). In this study, we employed virtual screening technology to evaluate the binding affinity of hundreds of our previously designed compounds with M pro , with nilotinib as the positive control, and Indole guanidine derivatives as the native control [23], in order to discovery a novel M pro inhibitor (refer to Section 3.1) [24].

Virtual Screening
Protein Activity Site Analysis. In this study, we obtained the M pro structure from the PDB protein database (https://www.rcsb.org/structure/6LU7, accessed on 28 April 2022), and the native ligand for M pro was the potential molecular inhibitor N3 [18]. Designed by Haitao Yang et al. in 2005 [25], N3 is a broad-spectrum inhibitor targeting the main protein of coronavirus. By binding to the active site of the SARS-CoV-2 main protease, and interacting with the Cys145 residue through a Michael addition reaction, N3 is capable of forming a stable covalent bond that exerts an inhibitory effect on the activity of the protein.
The three-dimensional spatial structure and secondary structure of the main protein can be observed in 6LU7 (Figure 2A,B) [26][27][28][29], which consists of two domains, including 10 α-helices and 13 β-sheets. The ligand binds to the active site near β3, β4, and β16, with Cys145 and His41 as key residues. SP Docking. We conducted standard precision molecular docking (SP docking) on all 320 molecules, including the positive compound nilotinib, to assess their affinity to M pro (Table S1) within the active site. The molecular structures of the compounds were drawn using Maestro's 2D Sketcher (version v128117, Schrödinger 2018-1, New York, NY, USA), and were subsequently prepared for docking through the use of LigPrep (LigPrep, Schrödinger, LLC, New York, NY, USA). Prior to docking, the protein underwent preparation and energy optimization, with the aid of Wizard in Glide. The receptor-grid-generation tool was employed to generate a grid box centered on the native ligand N3, with adjustments made to the grid size to optimize calculations. The minimized molecules were then docked to the grid box, utilizing standard precision docking. The results indicated that most drugs had a binding energy between −5.5 to −7.5 kcal/mol ( Figure 2C), while 63 compounds exhibited a higher docking binding energy than nilotinib, with ∆G > −7.026 kcal/mol. (Tables 1 and S1).
XP Docking and MM-GBSA Evaluation. As the extra-precision docking (XP docking) mode is more advanced, it helps to filter out false positives, and provides a better association between the docking score and good poses [30]. Therefore, XP docking was employed to re-dock the top 64 molecules, including nilotinib, with M pro , and to rank compounds according to their XP binding energy (Table 1). Based on the protein and re-docked ligands, molecular mechanics/generalized Born surface area (MM-GBSA) analysis was conducted, to evaluate the free ligand binding energy of 48 compounds with a higher XP binding energy than nilotinib (∆G > 5.476 kcal/mol). The results of the analysis identified 32 molecules with higher free binding energies compared to nilotinib (dG > −48.96 kcal/mol), which were selected for further screening studies.

Further Screening through Induced-Fit Docking
The induced-fit theory was introduced by Koshland in 1995, and posits that enzymes do not have a complementary shape to the substrate, but form a complementary shape only after induction [31]. When it comes to ligand-receptor interaction, the conformation of the receptor, especially around the binding site, was also induced to be altered, better matching the shape and binding pattern of the ligand molecule. In this study, to obtain more realistic models of molecule-protein interactions, induced-fit docking (IFD) was applied for further virtual screening.
The 32 compounds were chosen to be re-docked into M pro through the application of induced-fit docking (IFD), with nilotinib as the control, and their induced-fit-docking binding energy was calculated. Multiple docking results were produced during IFD docking, due to the various poses of the compounds, and we recorded the results with the highest docking energy for each compound in Table 1. The interactions of protein-ligand complexes were then displayed and analyzed using Maestro, and the potential interactions of all 32 compounds were counted (Figures 3-8 and Table S2). In comparison to the SP and XP docking, the IFD docking facilitated stronger binding and more protein interactions with the filtered molecules, further accentuating the differences in affinity between these compounds and M pro . As shown in Table 2 (or Table S2), nilotinib interacts with M pro mainly through hydrogen bonds, consistent with what has been reported in the literature. As for the screened compounds, they are able to engage more amino acids in proteinligand interactions, and display new interactions, such as electrostatic interaction and pi-pi stacking hydrophobic interaction, compared to nilotinib.  Compound V253 and M pro Interaction. The compound V253 has a pyridine biphenyl amide structure, with serine as a linker. It can form ten hydrogen bonds (Leu141, Gly143, Ser144, Cys145, His164, Glu166, Gln189, and Gln192), two pi-pi stacking interactions (His41), and a salt bridge (Glu166), which enhances the stability of the complex. In addition to similar hydrogen bonds to nilotinib, there are also other hydrogen bonds, including: (1) the NH of the amide in the serine backbone with His164 and Gln189; (2) the OH in the serine side chain with Leu141; (3) the N atom on the pyridine with Gln192; and (4) the NH of the amide linked to the pyridine with Glu166. Notably, V253 is capable of forming two pi-pi stacking interactions with the imidazole ring of His41. Furthermore, the acyl ethylenediamine connected to the pyridine contains a primary amine which has been protonated, allowing it to form salt-bridge interactions with Glu166. Overall, V253 could bind to M pro through a hydrogen bond and pi-pi bond with His41 and Cys145, thereby impairing the activity of protease. Here, considering the diversity in structural features and binding interaction models, five molecules are selected as representatives to illustrate the molecule-protein interactions. Nilotinib was employed as a control, to further clarify the mechanism behind the compounds' capacity to generate stronger interaction forces. Compound V247 and M pro Interaction. The structure of the compound V247 closely resembles that of V253, with the only difference being the presence of a tBu protecting group on the serine sidechain of V247, thus making the docking pose and ligand-protein interactions similar to those of V253. The docking result of the compound V247 with M pro shows six hydrogen bonds (Asn142, His164, Glu166, Gln189, and Gln192), and a salt bridge with Glu166, due to the protonated ammonia in the ligand, thus forming a stable complex. Exclusively, the CO of the amide in V247 generates one hydrogen bond with Asn142, which is different to that of V253. Interestingly, despite having fewer interactions, V247 demonstrates a similar binding energy to that of V253.  Compound V133 and M pro Interaction. The V133 shares a comparable structure with the compound V253, except for the linker type of amino acid present in its molecule, which is alanine instead of serine. Thus, the interaction type of V133 with M pro is quite similar to that of V253. As shown in Figure 6, the compound V133 binds to the protein through six hydrogen bonds (Gly143, Cys145, Glu166, Arg188, and Gln189), and further promotes the stability of the complex through the salt bridge formed between the protonated ammonia (primary nitrogen in ethylenediamine) in the ligand, and Glu166. Uniquely, the CO of the amide attached to the pyridine demonstrates two hydrogen bonds, with Gly143 and Cys145, respectively. that of V253. As shown in Figure 6, the compound V133 binds to the protein through six hydrogen bonds (Gly143, Cys145, Glu166, Arg188, and Gln189), and further promotes the stability of the complex through the salt bridge formed between the protonated ammonia (primary nitrogen in ethylenediamine) in the ligand, and Glu166. Uniquely, the CO of the amide attached to the pyridine demonstrates two hydrogen bonds, with Gly143 and Cys145, respectively. Compound V228 and M pro Interaction. The compound V228 has a backbone structure with tertiary leucine as a linker. The docking result shows that the compound V228 can bind to the active site of M pro , and their interaction includes seven hydrogen bonds (His41, Asn142, His164, Glu166, His172, and Gln189). Compared to V253, the incorporation of tert-butyl into the compound V228 confers a significant steric hindrance, leading to a distinct conformation in the docking results. The amide structure attached to the pyridine exhibits hydrogen-bonding interactions with His41, His164, and Gln189, while another amide structure linked to the halogenated benzene engages in hydrogen bonding with Glu166 and Asn142. Unlike the above compounds, V228 with a cyclopropanamide structure lacks a primary amine, so it cannot form a distinctive salt-bridging interaction with Glu166. Compound V291 and M pro Interaction. Notably, the pyrrolidine of the proline in V291 provides the basis for restricting the deformation of the molecule during its binding to the protein, allowing its stable insertion into, and binding to, the protein active site. The indazole structure in V291 exhibits a stronger hydrophobic interaction, compared to the pyridine-linked pyrimidine structure in nilotinib, which facilitates ligand-protein interactions. This compound binds to M pro through nine hydrogen bonds (Thr26, His41, Asn142, Gly143, Cys145, and Glu166), three pi-pi stacking interactions (His41 and His163), and one salt bridge (Glu166). Indazole has the unique ability to form two pi-pi stacking interactions with His41, which is not found in compounds with other structural types. The hydrogen bond between the ligand and Cys145 or His41 could prevent the key residues cysteine and histidine from participating in substrate hydrolysis catalysis. Three pi-pi stacking interactions improve the hydrophobic interaction, to enhance the binding of the complex. Nilotinib and M pro Interaction. Nilotinib can bind to M pro well, in its active pocket ( Figure 3). The complex consists of seven hydrogen bonds (Glu142, Cys145, His164, Glu166, Gln 189), including (1) two N atoms on the pyrimidine with Cys145, Asn142, and Gln189, respectively; (2) NH linked to the pyrimidine with Asn142; (3) CO in amide with Asn142 and Gln189; and (4) NH in amide with Glu166. The IFD-docking binding energy of stabilization is ∆G = −9.179 kcal/mol. Based on the results above, it was found that these screened compounds interact with the M pro protein mainly through hydrogen bonding, pi-pi stacking, and salt bridges. Hydrogen bonding is the most common interaction in these molecule-protein complexes, and Gly143, Cys145, Glu166, and Gln189 are the key residues involved in hydrogen bonding interactions. Consistent with nilotinib, some compounds, such as V133, V253, and V291, can directly interact with cysteine 145 through hydrogen bonding, leading to the direct inhibition of the cysteine-mediated hydrolysis reaction. Moreover, His41 is commonly involved in pi-pi stacking with aromatic structures in docking molecules. In particular, the compound V291, with a larger indazole aromatic ring structure, exhibits stronger pi-pi stacking with His41, and hydrophobic effects, enabling His41 to participate more effectively in ligand-receptor interactions. Additionally, the secondary amine embedded in the proline of V291, and the primary amines of other compounds, facilitate the involvement of Glu166 in salt-bridge generation between the molecules and the M pro protein. Notably, for Cys145 and His41, two key amino acids involved in enzymatic hydrolysis, nilotinib only forms hydrogen bonds with cysteine, while the screened compounds can simultaneously interact with another key amino acid, His41, in pi-pi stacking, which might enhance the enzymatic inhibition. Moreover, the reported compounds could generate salt bridges with Glu166, revealing a new bind mode with M pro . These findings could provide novel perspectives for the development of M pro inhibitors.

Pharmacophore Analysis
On the basis of molecular docking, we counted and analyzed the molecules which could bind with M pro at the active site. Next, these molecules were selected for superimposition through the application of their docking conformation (as shown in Figure 9A), and we further analyzed the pharmacophore of all the molecules. As shown in Figure 9B, there are seven common components identified from the superimposition: Compound V253 and M pro Interaction. The compound V253 has a pyridine biphenyl amide structure, with serine as a linker. It can form ten hydrogen bonds (Leu141, Gly143, Ser144, Cys145, His164, Glu166, Gln189, and Gln192), two pi-pi stacking interactions (His41), and a salt bridge (Glu166), which enhances the stability of the complex. In addition to similar hydrogen bonds to nilotinib, there are also other hydrogen bonds, including: (1) the NH of the amide in the serine backbone with His164 and Gln189; (2) the OH in the serine side chain with Leu141; (3) the N atom on the pyridine with Gln192; and (4) the NH of the amide linked to the pyridine with Glu166. Notably, V253 is capable of forming two pi-pi stacking interactions with the imidazole ring of His41. Furthermore, the acyl ethylenediamine connected to the pyridine contains a primary amine which has been protonated, allowing it to form salt-bridge interactions with Glu166. Overall, V253 could bind to M pro through a hydrogen bond and pi-pi bond with His41 and Cys145, thereby impairing the activity of protease.
Compound V247 and M pro Interaction. The structure of the compound V247 closely resembles that of V253, with the only difference being the presence of a tBu protecting group on the serine sidechain of V247, thus making the docking pose and ligand-protein interactions similar to those of V253. The docking result of the compound V247 with M pro shows six hydrogen bonds (Asn142, His164, Glu166, Gln189, and Gln192), and a salt bridge with Glu166, due to the protonated ammonia in the ligand, thus forming a stable complex. Exclusively, the CO of the amide in V247 generates one hydrogen bond with Asn142, which is different to that of V253. Interestingly, despite having fewer interactions, V247 demonstrates a similar binding energy to that of V253.
Compound V133 and M pro Interaction. The V133 shares a comparable structure with the compound V253, except for the linker type of amino acid present in its molecule, which is alanine instead of serine. Thus, the interaction type of V133 with M pro is quite similar to that of V253. As shown in Figure 6, the compound V133 binds to the protein through six hydrogen bonds (Gly143, Cys145, Glu166, Arg188, and Gln189), and further promotes the stability of the complex through the salt bridge formed between the protonated ammonia (primary nitrogen in ethylenediamine) in the ligand, and Glu166. Uniquely, the CO of the amide attached to the pyridine demonstrates two hydrogen bonds, with Gly143 and Cys145, respectively. Compound V228 and M pro Interaction. The compound V228 has a backbone structure with tertiary leucine as a linker. The docking result shows that the compound V228 can bind to the active site of M pro , and their interaction includes seven hydrogen bonds (His41, Asn142, His164, Glu166, His172, and Gln189). Compared to V253, the incorporation of tertbutyl into the compound V228 confers a significant steric hindrance, leading to a distinct conformation in the docking results. The amide structure attached to the pyridine exhibits hydrogen-bonding interactions with His41, His164, and Gln189, while another amide structure linked to the halogenated benzene engages in hydrogen bonding with Glu166 and Asn142. Unlike the above compounds, V228 with a cyclopropanamide structure lacks a primary amine, so it cannot form a distinctive salt-bridging interaction with Glu166.
Compound V291 and M pro Interaction. Notably, the pyrrolidine of the proline in V291 provides the basis for restricting the deformation of the molecule during its binding to the protein, allowing its stable insertion into, and binding to, the protein active site. The indazole structure in V291 exhibits a stronger hydrophobic interaction, compared to the pyridine-linked pyrimidine structure in nilotinib, which facilitates ligand-protein interactions. This compound binds to M pro through nine hydrogen bonds (Thr26, His41, Asn142, Gly143, Cys145, and Glu166), three pi-pi stacking interactions (His41 and His163), and one salt bridge (Glu166). Indazole has the unique ability to form two pi-pi stacking interactions with His41, which is not found in compounds with other structural types. The hydrogen bond between the ligand and Cys145 or His41 could prevent the key residues cysteine and histidine from participating in substrate hydrolysis catalysis. Three pi-pi stacking interactions improve the hydrophobic interaction, to enhance the binding of the complex.
Based on the results above, it was found that these screened compounds interact with the M pro protein mainly through hydrogen bonding, pi-pi stacking, and salt bridges. Hydrogen bonding is the most common interaction in these molecule-protein complexes, and Gly143, Cys145, Glu166, and Gln189 are the key residues involved in hydrogen bonding interactions. Consistent with nilotinib, some compounds, such as V133, V253, and V291, can directly interact with cysteine 145 through hydrogen bonding, leading to the direct inhibition of the cysteine-mediated hydrolysis reaction. Moreover, His41 is commonly involved in pi-pi stacking with aromatic structures in docking molecules. In particular, the compound V291, with a larger indazole aromatic ring structure, exhibits stronger pi-pi stacking with His41, and hydrophobic effects, enabling His41 to participate more effectively in ligand-receptor interactions. Additionally, the secondary amine embedded in the proline of V291, and the primary amines of other compounds, facilitate the involvement of Glu166 in salt-bridge generation between the molecules and the M pro protein. Notably, for Cys145 and His41, two key amino acids involved in enzymatic hydrolysis, nilotinib only forms hydrogen bonds with cysteine, while the screened compounds can simultaneously interact with another key amino acid, His41, in pi-pi stacking, which might enhance the enzymatic inhibition. Moreover, the reported compounds could generate salt bridges with Glu166, revealing a new bind mode with M pro . These findings could provide novel perspectives for the development of M pro inhibitors.

Pharmacophore Analysis
On the basis of molecular docking, we counted and analyzed the molecules which could bind with M pro at the active site. Next, these molecules were selected for superimposition through the application of their docking conformation (as shown in Figure 9A), and we further analyzed the pharmacophore of all the molecules. As shown in Figure 9B, there are seven common components identified from the superimposition: three aromatic rings (R10, R11, R12, orange), three acceptors (A2, A3, A4, pink), and one donor (D6, blue). In particular, R10 and R11, A3, and D6 are deeply embedded in the pocket of the active site, thereby providing steric hindrance to substrate binding to the protein. Besides, A2 and A4 are located near the key cysteine amino acid, to prevent the substrate from reacting with Cys145. Moreover, R12, which is frequently occupied by benzene rings, could exhibit hydrophobic interactions with the protein on the left side of the pocket. Meanwhile, R10s are predominantly captured by aromatic rings, allowing for effective pi-pi interactions with another crucial amino acid, His41. In addition, the amide structures are mostly observed at A3 and D6, with electron-absorbing groups promoting the polarization of the NH bond to effectively act as a hydrogen-bond donor, while the carbonyl oxygen acts as a hydrogenbond acceptor, resulting in interactions with neighboring amino acids, such as Glu166. These findings are in agreement with the conclusions drawn from the ligand-receptor interactions demonstrated by the IFD-docking results.
three aromatic rings (R10, R11, R12, orange), three acceptors (A2, A3, A4, pink), and one donor (D6, blue). In particular, R10 and R11, A3, and D6 are deeply embedded in the pocket of the active site, thereby providing steric hindrance to substrate binding to the protein. Besides, A2 and A4 are located near the key cysteine amino acid, to prevent the substrate from reacting with Cys145. Moreover, R12, which is frequently occupied by benzene rings, could exhibit hydrophobic interactions with the protein on the left side of the pocket. Meanwhile, R10s are predominantly captured by aromatic rings, allowing for effective pi-pi interactions with another crucial amino acid, His41. In addition, the amide structures are mostly observed at A3 and D6, with electron-absorbing groups promoting the polarization of the NH bond to effectively act as a hydrogen-bond donor, while the carbonyl oxygen acts as a hydrogen-bond acceptor, resulting in interactions with neighboring amino acids, such as Glu166. These findings are in agreement with the conclusions drawn from the ligand-receptor interactions demonstrated by the IFDdocking results.

Molecular Dynamics Analysis
The virtual screening and molecular docking have shown the binding ability of different compounds to the main protease, while molecular-dynamics simulations can provide insight into the dynamic stability of receptor-ligand complexes under physiological conditions. Thus, based on the binding affinity, ligand-receptor complex interaction, and structural diversity of the compounds (Figure S1), 20 compounds were selected ( Figure S2) to undergo molecular-dynamics simulations, to investigate their binding stability ( Figure 10). Throughout the molecular-dynamics simulation, one frame of trajectory was generated every 100 ps, resulting in 1000 frames after 100 ns of operation. Additionally, a thorough analysis of the resulting data was conducted, including the rootmean-square deviation (RMSD), the root-mean-square fluctuation (RMSF), and a proteinligand contact analysis.
The conformation of the ligand-protein complex at 0 ns was used as the reference for the calculation of the RMSD values of all 1000 frames. During the simulation, six compounds (V131, V133, V172, V245, V247, and V253) showed poor dynamic stabilities (Figure 10), while other compounds exhibited stable binding with the M pro protein at the active site. Fluctuations in the RMSD values are usually associated with changes in ligandprotein interaction. For example, the compound V247 displayed a large variation in RMSD; it initially formed a strong bond with the Glu166 of M pro , but this weakened or even disappeared after 27 ns ( Figure S3). Figure S3A shows the three-dimensional structure of the V247-M pro complex, at 1 ns and 28 ns, respectively. It is evident that the N-(2-(dimethylamino)ethyl)nicotinamide of the compound V247 detached from the protein, and the ethylenediamine portion was entirely separated from the protein surface

Molecular Dynamics Analysis
The virtual screening and molecular docking have shown the binding ability of different compounds to the main protease, while molecular-dynamics simulations can provide insight into the dynamic stability of receptor-ligand complexes under physiological conditions. Thus, based on the binding affinity, ligand-receptor complex interaction, and structural diversity of the compounds (Figure S1), 20 compounds were selected ( Figure S2) to undergo molecular-dynamics simulations, to investigate their binding stability ( Figure 10). Throughout the molecular-dynamics simulation, one frame of trajectory was generated every 100 ps, resulting in 1000 frames after 100 ns of operation. Additionally, a thorough analysis of the resulting data was conducted, including the root-mean-square deviation (RMSD), the root-mean-square fluctuation (RMSF), and a protein-ligand contact analysis.
The conformation of the ligand-protein complex at 0 ns was used as the reference for the calculation of the RMSD values of all 1000 frames. During the simulation, six compounds (V131, V133, V172, V245, V247, and V253) showed poor dynamic stabilities (Figure 10), while other compounds exhibited stable binding with the M pro protein at the active site. Fluctuations in the RMSD values are usually associated with changes in ligand-protein interaction. For example, the compound V247 displayed a large variation in RMSD; it initially formed a strong bond with the Glu166 of M pro , but this weakened or even disappeared after 27 ns ( Figure S3). Figure S3A shows the three-dimensional structure of the V247-M pro complex, at 1 ns and 28 ns, respectively. It is evident that the N-(2-(dimethylamino)ethyl)nicotinamide of the compound V247 detached from the protein, and the ethylenediamine portion was entirely separated from the protein surface at 28 ns, rendering it unable to interact with residues, thereby leading to strong fluctuations.
As for the other 14 ligand-protein complexes with a good dynamic stability, they showed stable RMSD values over long periods of time. Four representative complexes were selected for further MD simulations over 100 ns, to verify the stability of the complexes ( Figure S4). To delve deeper into the dynamic changes in protein-ligand contact during the simulation, an analysis of the root-mean-square fluctuation (RMSF) and protein-ligand contacts was conducted (Figures S5-S11). The RMSF is useful for characterizing local changes along the protein chain, which helped us to identify the residues responsible for altering the fluctuations in the protein-ligand complex structure. For all complexes except the V222-M pro complex and the V254-M pro complex, the protein RMSF values exhibited a negligible variation, suggesting that the system was in a state of equilibrium throughout the simulation. The residues in contact with the ligand are annotated in green, and the status of their contact with the ligand was monitored every 0.2 ns, throughout the 20 ns dynamic simulation. Most compounds maintained contact with a number of particular residues during the dynamic simulation, except for the V75-M pro complex, the V97-M pro complex, and the V282-M pro complex.
As for the other 14 ligand-protein complexes with a good dynamic stability, they showed stable RMSD values over long periods of time. Four representative complexes were selected for further MD simulations over 100 ns, to verify the stability of the complexes ( Figure S4). To delve deeper into the dynamic changes in protein-ligand contact during the simulation, an analysis of the root-mean-square fluctuation (RMSF) and protein-ligand contacts was conducted (Figures S5-S11). The RMSF is useful for characterizing local changes along the protein chain, which helped us to identify the residues responsible for altering the fluctuations in the protein-ligand complex structure. For all complexes except the V222-M pro complex and the V254-M pro complex, the protein RMSF values exhibited a negligible variation, suggesting that the system was in a state of equilibrium throughout the simulation. The residues in contact with the ligand are annotated in green, and the status of their contact with the ligand was monitored every 0.2 ns, throughout the 20 ns dynamic simulation. Most compounds maintained contact with a number of particular residues during the dynamic simulation, except for the V75-M pro complex, the V97-M pro complex, and the V282-M pro complex.

ADME/T Prediction
ADME/T is a very important standard evaluation in contemporary drug design and drug screening. Any compound that exhibits drug-likeness must have moderate ADME/T properties. QikProp was used here to predict the ADME/T properties of the 20 compounds mentioned above ( Table 3). The QikProp analysis includes the following standard limits: the PSA (the Van der Waals surface area of polar nitrogen and oxygen atoms and carbonyl carbon atoms), QPlogS (the aqueous solubility, log S. S in mol dm -3 is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid), QPlogPo/w (the octanol/water partition coefficient), donorHB, accptHB,

ADME/T Prediction
ADME/T is a very important standard evaluation in contemporary drug design and drug screening. Any compound that exhibits drug-likeness must have moderate ADME/T properties. QikProp was used here to predict the ADME/T properties of the 20 compounds mentioned above ( Table 3). The QikProp analysis includes the following standard limits: the PSA (the Van der Waals surface area of polar nitrogen and oxygen atoms and carbonyl carbon atoms), QPlogS (the aqueous solubility, log S. S in mol dm -3 is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid), QPlogPo/w (the octanol/water partition coefficient), donorHB, accptHB, CNS (the central nervous system activity), #metab (the number of likely metabolic reactions), human oral absorption, QPlogBB (the brain/blood partition coefficient), QPPMDCK (the apparent MDCK cell permeability in nm/s), QPPCaco (the apparent Caco-2 cell permeability in nm/s), and QPlogHERG (the IC 50 value for the blockage of HEGR K+ channels). Based on the collective data, we inferred that the ADME/T properties of compounds were within the prescribed limits for a potential candidate drug compound.

Compound Enzymatic Activity Assay
Based on the MD stimulations and the ADME/T predicted results, nine compounds were selected for assessment for their inhibitory potency, and for the calculation of their IC 50 values (Table 4). We used a fluorescence resonance energy transfer (FRET)-based assay to measure these 9 compounds' inhibitory activity against the M pro protein in vitro. As shown in Table 3, the compound V291 exhibited a potent inhibitory activity, with an IC 50 value of 2.77 ± 0.56 µM, while other compounds did not exhibit significant inhibitory activity against M pro , within a concentration of 20 µM. This preliminary result provides biological evidence that the compound V291, which has a similar structure to nilotinib, could potentially exert an inhibitory effect on M pro .

Compound Enzymatic Activity Assay
Based on the MD stimulations and the ADME/T predicted results, nine compounds were selected for assessment for their inhibitory potency, and for the calculation of their IC50 values (Table 4). We used a fluorescence resonance energy transfer (FRET)-based assay to measure these 9 compounds' inhibitory activity against the M pro protein in vitro. As shown in Table 3, the compound V291 exhibited a potent inhibitory activity, with an IC50 value of 2.77 ± 0.56 μM, while other compounds did not exhibit significant inhibitory activity against M pro , within a concentration of 20 μM. This preliminary result provides biological evidence that the compound V291, which has a similar structure to nilotinib, could potentially exert an inhibitory effect on M pro .

Experimental Procedures
We have confirmed the structures of more than 300 compounds, and have compiled a database. The design concepts of the compounds, as well as the synthetic routes, have been reported in our previously published literature [21,22]. To identify compounds with a higher binding affinity than nilotinib, we used SP docking, resulting in the identification of 63 molecules. Following the SP docking results, a pharmacophore analysis was performed, to identify and examine the common features of the compounds' docking conformations. Furthermore, we redocked these 63 compounds utilizing XP docking, and MM-GBSA analysis was further used to screen a total of 32 molecules with a good binding affinity. Induced-fit docking (IFD) was performed to identify the interactions between these 32 molecules and M pro . Based on the ligand-receptor interactions, and structure type (Figures 11 and S1), 20 ligand-receptor complexes were subjected to a molecular-dynamics (MD) simulation to assess their stability, and the ADME/T properties of these 20 molecules ( Figure S2) were evaluated. After careful evaluation, nine compounds with a good MD 19.92 ± 1.27

Experimental Procedures
We have confirmed the structures of more than 300 compounds, and have compiled a database. The design concepts of the compounds, as well as the synthetic routes, have been reported in our previously published literature [21,22]. To identify compounds with a higher binding affinity than nilotinib, we used SP docking, resulting in the identification of 63 molecules. Following the SP docking results, a pharmacophore analysis was performed, to identify and examine the common features of the compounds' docking conformations. Furthermore, we redocked these 63 compounds utilizing XP docking, and MM-GBSA analysis was further used to screen a total of 32 molecules with a good binding affinity. Induced-fit docking (IFD) was performed to identify the interactions between these 32 molecules and M pro . Based on the ligand-receptor interactions, and structure type (Figures 11 and S1), 20 ligand-receptor complexes were subjected to a molecular-dynamics (MD) simulation to assess their stability, and the ADME/T properties of these 20 molecules ( Figure S2) were evaluated. After careful evaluation, nine compounds with a good MD stability and reasonable ADME/T predictions were selected, and their IC 50 against the M pro protein was evaluated. stability and reasonable ADME/T predictions were selected, and their IC50 against the M pro protein was evaluated. Figure 11. The flowchart of steps to identify the novel potential inhibitor.

Ligand Preparation
All of the molecular structures of compounds were drawn in Maestro's 2D Sketcher (Maestro v128117, Schrödinger 2018-1, New York, NY, USA). The molecular preparation and energy minimization were carried out under the OPLS3 force field, using LigPrep in Glide (Glide v91117, Schrödinger 2018-1, New York, NY, USA). With the help of Epik, the Figure 11. The flowchart of steps to identify the novel potential inhibitor.

Ligand Preparation
All of the molecular structures of compounds were drawn in Maestro's 2D Sketcher (Maestro v128117, Schrödinger 2018-1, New York, NY, USA). The molecular preparation and energy minimization were carried out under the OPLS3 force field, using LigPrep in Glide (Glide v91117, Schrödinger 2018-1, New York, NY, USA). With the help of Epik, the algorithm simulated the ionization state of molecules in the environment of pH = 7.0 ± 2.0 (Epik v56117, Schrödinger 2018-1, New York, NY, USA). The other parameters were maintained at the default, and each molecule generated, at most, 32 stereoisomers with retaining specified chiralities.

Protein Preparation and Grid Generation
The protein we used in this study was the SARS-CoV-2 main protease, and the X-ray co-crystal structure was provided by the PDB network database (https://www.rcsb.org/ structure/6LU7, accessed on 28 April 2022). The protein was processed and optimized using Protein Preparation Wizard in Glide (Glid v91117, Schrödinger 2018-1, New York, NY, USA). Specifically, the protein was preprocessed to initially screen for problems and defects in the spatial structure, and then hydrogen atoms were added through H-bond assignment. After the removal of the water or any other molecular solvent from the structure, the protein was optimized under the OPLS3 force field. The other parameters were maintained at the default. We then obtained the minimized protein structure, to generate the grid file for ligand docking. The grid box was generated with the native ligand in 6LU7 as the center, within the receptor grid generation, and its size was adjusted for the best calculation range.

Molecular Docking
The minimized molecule was docked using ligand docking in Glide (Glid v91117, Schrödinger 2018-1, New York, NY, USA). All molecules bound to the active site of the protein through SP/XP docking and flexible docking. Post-docking minimization was used to optimize the ligand-protein complexes, limiting the number of optimized ligands to less than 6. "Sample nitrogen inversions" and "Sample ring conformations" were selected, with "Flexible" ligand sampling. A bias sampling of the torsions for all the predefined functional groups was performed, through adding the Epik state penalties to the docking score. Any constraints were ignored. The docking results were analyzed and displayed in Maestro. The docking affinity energy of the compound with the protein was calculated using the following equation: where R is the Boltzmann gas constant (R = 1.987 cal/mol/K), T is the default temperature of simulated docking (T = 298 K), and K d is the binding affinity of the docking.

MM-GBSA
To evaluate the free binding energy between the protein and the docked ligand, the MM-GBSA of the Prime module (Prime v64117, Schrödinger 2018-1, LLC, New York, NY, USA) was employed [32]. The MM-GBSA dG (Molecular Mechanics/Generalized Born Surface Area dG) of the optimized receptor-ligand complex was calculated to determine the ligand-binding affinities. During the calculation of the free binding energy, the VSGB solvation model and the OPLS3e force field were applied. The binding energy was calculated based on the following equation: The MM-GBSA calculations entailed maintaining the rigidity of all protein atoms, whilst relaxing the compound's atoms. Furthermore, the ranking of the protein-compound complexes was carried out using binding-free-energy calculations.

Induced-Fit Docking (IFD)
The proteins and ligands were prepared in Maestro, as mentioned in the method above, followed by the induced-fit docking (Induced Fit Docking, Schrödinger2018-1, LLC, New York, NY, USA). The prepared ligands file was imported, and the standard protocol was selected, with the OPLS3 as the force field due to its wider parameter range and significant improvements. The native ligand was selected as centroid, to generate a box of automatically generated size. We ignored constraints and, in the Ligand options, we ticked "Sample ring conformations", with 2.5 kcal/mol as the energy window. In Prime Refinement, we refined residues within 5.0 Å of ligand poses; with Optimize, "Side chains" was ticked; and in Glide Redocking, we choose "SP docking" for the precision. We redocked the structures within 30.0 kcal/mol of the best structure, and within the top 20 structures overal1. Other parameters were maintained at the default. The docking results were displayed in Maestro (Maestro v128117, Schrödinger 2018-1, New York, NY, USA), and their ligand-receptor interaction was analyzed, including hydrogen bonds (within 3.5 Å), halogen bonds (within 3.5 Å), salt bridges (within 5.0 Å), pi-pi stacking (within 5.5 Å), and pi-action (within 6.6 Å).

Pharmacophore Analysis
Based on the docking results of all of the compounds, a pharmacophore analysis was performed in Maestro. The protein-ligand complexes were imported, and used to create a pharmacophore model within the Develop Pharmacophore Model module (Phase, Schrödinger 2018-1, LLC, New York, NY, USA). "R (Aromatic Ring)", "A (H-bond acceptor)", and "D (H-bond donor)" were picked to be highlighted as features, and then a receptor-based excluded-volume shell was created, using the default parameters.

ADME/T Prediction
The drug-likeness of all the compounds was evaluated using QikProp (QikProp v68117, Schrödinger 2018-1, LLC, New York, NY, USA), to determine their absorption, distribution, metabolism, excretion, and toxicological (ADME/T) properties. The analysis results of the 20 selected compounds are shown in Table 3.

Molecular Dynamics Analysis
We carried out molecular-dynamics (MD) simulation studies, using the Dynamics module (Desmond v53011, Schrödinger 2018-1, LLC, New York, NY, USA) of Schrodinger. At the beginning, the complex was processed using the Protein Prepare Module, according to the default parameters. The bonding information was corrected, hydrogen atoms were added, and water molecules were removed, in order to obtain the minimized protein structure. Next, using the System Builder plate, the processed protein was imported, to be solvated with the water model of TIP3P and the force field of OPLS3. In order to ensure that the complex was completely wrapped in the simulated solvent environment, we set some parameters of boundary conditions, including the box shape of "Orthorhombic", the box-size calculation method of "Buffer", and "Minimize Volume". The Na+ was added to neutralize the negative charge of the protein. Then, we used molecular dynamics to simulate the dynamic simulation of the complex, with default parameters. Finally, we used the function of the Simulation Interactions Diagram to analysis the output of the molecular dynamics. The root-mean-square deviation (RMSD) was used to measure the average change in displacement of a selection of atoms, for a particular frame concerning a reference frame. The RMSD value at any time can be calculated using the following equation: (r i (t x ) − r i (t re f )) 2 (3) where N is the number of atoms in the atom selection; t ref is the reference time (typically, the first frame is used as the reference, and it is regarded as time t = 0); and r is the position of the selected atoms in frame x after superimposing on the reference frame, where frame x is recorded at time t x . The procedure is repeated for every frame in the simulation trajectory. The RMSF value at any time can be calculated using the following equation: where T is the trajectory time over which the RMSF is calculated; t ref is the reference time; r i is the position of residue i; r is the position of the atoms in residue i after superposition on the reference; and the angle brackets indicate that the average of the square distance has been taken across the selection of atoms in the residue.

In Vitro Enzymatic and Inhibition Assay
The half-maximal inhibitory concentration (IC 50 ) assay was performed based on the fluorescence resonance energy transfer (FRET) effect [33][34][35][36][37]. The fluorescent peptide MCA-AVLQSGFR-Lys(Dnp)-Lys-NH 2 was used here as the substrate. The quantities of enzyme and substrate used, as well as the reaction times, were optimized, to meet the requirement that the enzyme activity be in a linear phase. The M pro proteins (0.2 µM) were added to 20 mM Tris-HCl (pH 7.3) with 150 mM NaCl, and added to a black 96-well plate, at 91 µL/well. The compounds were dissolved in DMSO, and gradient-diluted with PBS, and then added to a plate at 5 µL/well, and incubated with M pro at 37 • C for 30 min. Then, 20 µM substrates were added into each well, with 4 µL/well. Positive controls (consisting of only the enzyme and substrate, without an inhibitor) and blank controls (containing only the substrate, without the enzyme) were also included in the experiments. Next, the samples were incubated in the dark at 37 • C for 10 min, and the OD intensities were then read at λ ex = 320 nm and λ em = 405 nm, using a microplate reader (Thermo Scientific, Waltham, MA, USA). Prism software was used to calculate the IC 50 value of the compounds in the inhibition of main protease activity. Due to the instantaneous initiation of the reaction upon the substrate addition, it is essential to perform the operations as rapidly as possible, to avoid the complete catalysis of the substrate. The pre-experiment of this study has confirmed that the enzymatic catalysis time for all the substrates is sufficient to complete all operations, after the substrate addition (t > 15 min).

Conclusions
COVID-19 has become a global public health crisis that continues to threaten the lives of people worldwide. The main protease (M pro ), which is critical in the proteolytic processing of polyproteins, and facilitating viral assembly, has gained much attention as a drug target. In this study, we aimed to find a potential inhibitor for the M pro of SARS-CoV-2, based on the reported inhibitor molecule nilotinib. We designed and synthesized over 300 compounds with structures similar to nilotinib, and screened them using moleculardocking technology (SP, XP, and IFD) and MM-GBSA analysis, to choose the potential compounds. We investigated the binding stability between the compounds and proteins using molecular dynamics, and predicted the drug-likeness of the molecules using QikProp. After assessing the stability and ADME/T predictions, nine compounds were identified as having strong potential, and were chosen for subsequent bioactivity evaluations. Our results show that the compound V291 is the most promising inhibitor, with an IC 50 value of 2.77 ± 0.56 µM. Additionally, in silico simulations revealed that His41 and Glu166 could engage in interactions with ligands, through pi-pi stacking and salt bridges, respectively. The pharmacophore analysis further confirmed these findings, which required the inhibitors possessing aromatic moiety and protonated nitrogen atoms to interact with His41 and Glu166. These findings provide valuable insights into the interactions between Bcr-Abl protease inhibitors and SARS-CoV-2 M pro , thus serving as a promising reference for the development of novel anti-SARS-CoV-2 agents. The compound V291 could be a potential candidate for further optimization as an M pro inhibitor.