Article

Designing Novel Compound Candidates Against SARS-CoV-2 Using Generative Deep Neural Networks and Cheminformatics

1 Graduate Institute of Public Health, College of Public Health, National Defense Medical University, Taipei City 114201, Taiwan
2 Institute of Preventive Medicine, National Defense Medical University, New Taipei City 237010, Taiwan
3 Graduate Institute of Medical Sciences, College of Medicine, National Defense Medical University, Taipei City 114201, Taiwan
4 School of Pharmacy, College of Medicine, National Cheng Kung University, Tainan 70101, Taiwan
5 College of Pharmacy, National Defense Medical University, Taipei City 114201, Taiwan
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(24), 12017; https://doi.org/10.3390/ijms262412017
Submission received: 14 October 2025 / Revised: 4 December 2025 / Accepted: 10 December 2025 / Published: 13 December 2025
(This article belongs to the Section Molecular Pharmacology)

Abstract

The COVID-19 outbreak has had a tremendous socioeconomic impact around the world. Although some drugs have been granted authorization by the U.S. FDA for the treatment of COVID-19, restrictions on their use remain, so further drug development research is still urgently needed. In this study, deep generative models and cheminformatics were used to design and screen novel small-molecule candidates with potential anti-SARS-CoV-2 activity. The structure of Molnupiravir, which has been authorized by the U.S. FDA for emergency use, was used as the query in a similarity search of the BIOVIA Available Chemicals Directory (BIOVIA ACD) database using the BIOVIA Discovery Studio (DS) software (version 2022). The search returned 61,480 structures similar to Molnupiravir, which served as the training dataset for the deep generative model; the reinforcement learning model was then used to generate 6000 small-molecule structures. To assess whether these structures might possess anti-SARS-CoV-2 activity, cheminformatics techniques were applied, yielding 38 small-molecule compounds with potential anti-SARS-CoV-2 activity. The suitability of these 38 structures was then evaluated using ADMET analysis. Finally, one compound structure, Molecule_36, passed ADMET and was unpatented. This study suggests that Molecule_36 may have better potential than Molnupiravir in terms of affinity for SARS-CoV-2 RdRp and ADMET properties. We provide a workflow combining generative deep neural networks and cheminformatics for developing new anti-SARS-CoV-2 compounds. However, additional chemical refinement and experimental validation will be required to determine the stability, mechanism of action, and antiviral efficacy of Molecule_36.

1. Introduction

SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2) is the virus that caused the COVID-19 pandemic after a pneumonia cluster occurred in Wuhan, China, in December 2019 [1]. By March 2023, 670 million people had been diagnosed, and over 6 million had died, corresponding to a fatality rate of 1.02% [2].
However, given the lack of a promising therapeutic approach for COVID-19, more research about the development of drug candidates is needed [3]. Although there are currently some drugs that have been used for the treatment of COVID-19, there are some restrictions on their use [4,5]. As a result, other oral drugs for the treatment of COVID-19 are urgently needed to relieve the spread of the pandemic.
Conventional drug development is a long, costly, and arduous process [6]. Many external factors may cause promising candidates to be eliminated during the process; accordingly, only about 5 of every 5000 drug candidates advance to clinical trials, and only about 1 is approved for marketing [7]. Artificial intelligence-based drug development is regarded as an effective tool for reducing the time and cost required for conventional drug development [8]. For example, in target identification, artificial intelligence (AI) can be used to integrate databases to clarify the relationship between disease and drug activity, or to efficiently process large amounts of chemical data in order to further optimize absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiles [9]. Currently, the majority of drug development research is carried out using the principles and techniques of machine learning (ML) within the field of AI [10], among which the most common methods are reinforcement learning (RL), supervised learning, and unsupervised learning [11]. AI techniques may effectively and quickly exclude compounds that are unlikely to be active, improving the efficiency of drug development and reducing the time and cost required [12]. In particular, deep neural networks (DNNs) have increasingly supported medicinal chemistry across early discovery, from de novo molecular generation to property optimization. Generative models, for example, variational autoencoders (VAEs), generative adversarial networks (GANs), and RL frameworks, have been used to design drug-like small molecules under synthesizability and drug-likeness constraints [13,14,15]. Gómez-Bombarelli et al. introduced the classic VAE framework for molecular generation and optimization [13]. De Cao and Kipf’s MolGAN demonstrated GAN-based molecular graph generation [14]. Olivecrona et al. demonstrated deep RL for de novo molecular design with property constraints [15].
In September 2019, Insilico Medicine released a deep generative model, generative tensorial reinforcement learning (GENTRL), a deep learning method for de novo drug design [16]. Using GENTRL, Zhavoronkov et al. identified a novel potent inhibitor of discoidin domain receptor 1 (DDR1) in 21 days. The study took only 46 days from screening to synthesis and cost a small fraction of the cost of conventional drug development. The novel compound designed by Zhavoronkov has now completed the Phase I clinical trial, the first AI-based inhibitor to be approved. In addition, generative deep learning has been used to rapidly propose novel SARS-CoV-2 inhibitor candidates, with Zhavoronkov et al. targeting the viral 3CLpro early in the COVID-19 pandemic [17]. More recently, Transformer-based de novo design frameworks have been employed to generate small antiviral molecules directed against the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp) [18]. Parallel advances in AI-assisted drug-target interaction modeling further enable rapid triage of large chemical spaces by coupling learned molecular representations with protein sequence/structure features. These directions collectively motivate integrating deep generative modeling with downstream in silico screening to prioritize tractable candidates. Deep generative models for de novo drug design are of particular interest because they can generate novel small molecule compounds with drug-like properties that can bind to protein targets, making them well suited to urgent SARS-CoV-2 drug development research [19].
Cheminformatics methods are essential for narrowing AI-generated chemical space into experimentally testable leads, as demonstrated in numerous studies on SARS-CoV-2 antiviral discovery [18,19,20,21,22,23]. For example, de novo designed molecules have been prioritized through integrated cheminformatics workflows combining pharmacophore matching [24], structure-based molecular docking to viral targets [19,21], binding free-energy estimation or scoring [21,22], and early ADMET and drug-likeness filtering to yield synthesis-ready candidates [24]. These examples highlight how cheminformatics can efficiently down-select AI-generated candidates by evaluating their target engagement, physicochemical properties, and medicinal chemistry feasibility before synthesis. Pharmacophore-guided filtering, structure-based docking, and drug-property screening have been widely incorporated into drug design pipelines to refine computationally generated molecules into feasible antiviral leads [25].
Among AI methods used in drug discovery (e.g., predictive drug-target interaction models, similarity search), generative deep neural networks (VAEs, GANs, and RL-augmented frameworks) are uniquely suited to creating novel, drug-like molecules beyond enumerated libraries while optimizing multiple objectives (activity, ADMET, and synthesizability) in a single workflow [19,20]. Prior studies demonstrated that generative models can rapidly deliver synthesis-ready chemotypes: GENTRL produced potent DDR1 inhibitors on an accelerated timeline [13]; GAN/RL-based de novo design generated candidate inhibitors for SARS-CoV-2 3CLpro early in the pandemic [17]; and Transformer-based generators have been applied to antiviral/RdRp-oriented design [18]. These successes, together with the ease of coupling generative outputs to pharmacophore-guided filtering and structure-based docking, motivated our selection of a modified GENTRL (mGENTRL) as the generative core of the pipeline in this study.

2. Results

2.1. Generation of New Chemical Structures

The structural formula of Molnupiravir is shown in Supplementary Figure S1A. We used DS software to enumerate the resonance structures of Molnupiravir; two were generated, of which the lower-energy structure was chosen for further analysis (Supplementary Figure S1B). To build the training dataset of Molnupiravir-like structures for the mGENTRL model, a ligand-based similarity search with the preprocessed Molnupiravir resonance structure was conducted against the ACD database. A total of 61,480 compounds with structures similar to Molnupiravir were retrieved (Supplementary Figure S2). Next, this set of similar structures was fed into the mGENTRL model, and the reinforcement learning mechanism was used to recognize and generate chemical structures matching the reward function. To check whether the model produced valid SMILES, we examined the percentage of valid SMILES among its outputs and obtained an average validity rate of 98.87%. Finally, the model sampled 6000 chemical structures, examples of which are shown in Figure 1.
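The validity check described above amounts to a simple ratio of accepted strings to generated strings. The following is a minimal sketch under stated assumptions: this section does not say which parser was used to judge validity, so `crude_syntax_check` is a hypothetical stand-in that only tests bracket balance, and the sample strings are invented for illustration. In practice, a cheminformatics toolkit parser would be passed in as `is_valid`.

```python
from typing import Callable, Iterable


def validity_rate(smiles_list: Iterable[str], is_valid: Callable[[str], bool]) -> float:
    """Return the percentage of strings accepted by the supplied validity check."""
    items = list(smiles_list)
    if not items:
        raise ValueError("no SMILES strings supplied")
    n_valid = sum(1 for s in items if is_valid(s))
    return 100.0 * n_valid / len(items)


def crude_syntax_check(smiles: str) -> bool:
    """Hypothetical stand-in for a real SMILES parser.

    Only checks that the string is non-empty and that parentheses and
    square brackets are balanced; it says nothing about chemistry.
    """
    depth = 0
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            if in_bracket:
                return False
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return False
            in_bracket = False
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0 and not in_bracket


# Invented sample output; the third string has an unclosed branch.
SAMPLES = ["CCO", "CC(=O)O", "CC(C", "C[N+](C)(C)C"]
RATE = validity_rate(SAMPLES, crude_syntax_check)
```

Keeping the validity test as a parameter means the same ratio can be recomputed with whichever parser a given pipeline actually trusts.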

2.2. Pharmacophore-Based Virtual Screening and Molecular Docking

A pharmacophore model based on Molnupiravir was created using DS software. Two hydrogen-bond donors, three hydrogen-bond acceptors, and two hydrophobic chemical features were identified, as shown in Figure 2. Based on this pharmacophore model, the 6000 chemical structures from mGENTRL were matched against the pharmacophore features, and 277 chemical structures with chemical properties similar to those of Molnupiravir were identified. Subsequently, these 277 chemical structures were docked to SARS-CoV-2 RdRp to identify potential candidate small molecules with a binding affinity greater than that of Molnupiravir. The protein used for molecular docking in this study was preprocessed, and the active site was configured as shown in Supplementary Figure S3. The active site contained the major amino acids that have been identified as targets of potent inhibitors of SARS-CoV-2 RdRp, such as GLY616, TRP617, ASP618, TYR619, PRO620, LYS621, CYS622, LEU758, SER759, ASP760, ASP761, ALA762, ALA797, LYS798, CYS799, TRP800, HIS810, GLU811, PHE812, CYS813, SER814, and GLN815. Furthermore, ASP760, ASP761, LYS798, and SER814 have been noted in relevant studies as essential amino acids for stabilizing the core structure of RdRp [26].
Subsequently, Molnupiravir and the 277 chemical structures were docked to the active site of the RdRp structure 7OZV. Thirty-eight chemical structures showed better binding affinity than Molnupiravir (Supplementary Table S1).
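The screening steps reported so far form a funnel, and it can help to see how sharply each stage narrows the candidate set. The sketch below is only a bookkeeping aid: the counts are the ones stated in the text, while the stage labels and the function name `retention_table` are our own.

```python
# Stage counts as reported in Sections 2.1 and 2.2; labels are illustrative.
FUNNEL = [
    ("sampled by mGENTRL", 6000),
    ("matched the pharmacophore", 277),
    ("docked better than Molnupiravir", 38),
]


def retention_table(stages):
    """Return rows of (stage, count, % of previous stage, % of first stage)."""
    first = stages[0][1]
    prev = first
    rows = []
    for name, count in stages:
        rows.append((name, count, 100.0 * count / prev, 100.0 * count / first))
        prev = count
    return rows


for name, count, pct_prev, pct_first in retention_table(FUNNEL):
    print(f"{name:33s} {count:5d}  {pct_prev:6.2f}% of previous  {pct_first:6.2f}% of initial")
```

Under these counts, pharmacophore matching retains roughly 4.6% of the generated structures, and docking retains roughly 13.7% of those matches.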

2.3. Prediction of Properties of Blood–Brain Barrier and Intestinal Absorption Using DS

As shown in Supplementary Table S1, the 38 small molecules selected by molecular docking simulation were analyzed using DS to predict human intestinal absorption and blood–brain barrier (BBB) permeability. Molecule_36 (IUPAC name: (S)-1-ethoxy-1,3-dioxo-3-((1-oxo-1-(propylamino)propan-2-yl)amino)propan-2-ide), a small molecule compound with potential drug properties, was predicted to have good human intestinal absorption as well as low BBB permeability. The 2D structural formula of Molecule_36 is shown in Figure 3.

2.4. Prediction of Molecule_36 Target Amino Acids of RdRp and Comparison with That of Molnupiravir

To identify which amino acids of RdRp were targeted by Molecule_36 and to compare whether these targets were similar to or different from those of Molnupiravir, we conducted additional molecular docking simulations. Figure 4 depicts the molecular docking simulation results for Molnupiravir and Molecule_36. Molnupiravir formed four types of interactions, including Conventional Hydrogen Bonds (TRP617, ASP761, ALA762, GLU811, and SER814), Carbon Hydrogen Bonds (LYS798 and G15), Pi-Anion, and Unfavorable Donor–Donor (ALA762). Molecule_36 formed five types of interactions, including Conventional Hydrogen Bonds (GLU811 and SER814), Carbon Hydrogen Bonds (TRP617, ASP761, and G15), Pi-Alkyl (TRP800), Alkyl (LYS798), and Unfavorable Negative–Negative (GLU811). The two ligands shared several interactions, including Conventional Hydrogen Bonds (GLU811 and SER814) and a Carbon Hydrogen Bond (G15). Furthermore, Molecule_36 also interacted with the target amino acids ASP761, LYS798, and SER814 of SARS-CoV-2 RdRp, suggesting its potential to inhibit SARS-CoV-2.
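The comparison above is essentially a set comparison of interaction partners per interaction type. As a minimal sketch, the dictionaries below transcribe the contacts listed in the text; the variable and function names are ours, and the Pi-Anion contact of Molnupiravir is omitted because its partner is not named in this section.

```python
# Interaction partners per interaction type, transcribed from the text above.
MOLNUPIRAVIR = {
    "Conventional Hydrogen Bond": {"TRP617", "ASP761", "ALA762", "GLU811", "SER814"},
    "Carbon Hydrogen Bond": {"LYS798", "G15"},
    "Unfavorable Donor-Donor": {"ALA762"},
    # Pi-Anion is reported, but its partner is not named, so it is left out here.
}

MOLECULE_36 = {
    "Conventional Hydrogen Bond": {"GLU811", "SER814"},
    "Carbon Hydrogen Bond": {"TRP617", "ASP761", "G15"},
    "Pi-Alkyl": {"TRP800"},
    "Alkyl": {"LYS798"},
    "Unfavorable Negative-Negative": {"GLU811"},
}


def shared_by_type(a, b):
    """Partners contacted by both ligands through the same interaction type."""
    return {t: a[t] & b[t] for t in a.keys() & b.keys() if a[t] & b[t]}


def all_partners(profile):
    """All partners contacted by a ligand, regardless of interaction type."""
    return set().union(*profile.values())


SHARED = shared_by_type(MOLNUPIRAVIR, MOLECULE_36)
ONLY_MOLECULE_36 = all_partners(MOLECULE_36) - all_partners(MOLNUPIRAVIR)
ONLY_MOLNUPIRAVIR = all_partners(MOLNUPIRAVIR) - all_partners(MOLECULE_36)
```

With these inputs, the type-matched overlap reproduces the shared contacts stated in the text (GLU811 and SER814 by conventional hydrogen bond, G15 by carbon hydrogen bond), while ignoring interaction type leaves TRP800 as the only partner unique to Molecule_36 and ALA762 as the only one unique to Molnupiravir.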

2.5. Comprehensive Prediction of ADMET Property of Molecule_36 and Molnupiravir Using Multiple Resources

Finally, SwissADME, pkCSM, and DS software were used to predict, analyze, and compare the ADMET properties of Molecule_36 and Molnupiravir, as shown in Table 1 [27]. Molecule_36 and Molnupiravir were both predicted to have good water solubility [28], and Molecule_36 had good intestinal absorption in humans [29]. Furthermore, Molecule_36 had a high Caco-2 permeability, implying that it had good drug absorption [30]. Both Molecule_36 and Molnupiravir were predicted to be non-P-glycoprotein (P-gp) substrates, implying that they would not be effluxed by P-gp and could achieve drug absorption and bioavailability [31]. Bioavailability scores for Molecule_36 and Molnupiravir were similar [32].
Molecule_36 had a low predicted steady-state volume of distribution (VDss), implying that its distribution in vivo was primarily into plasma rather than tissues [33]. Furthermore, because its predicted BBB and central nervous system permeability were low [34], it was speculated that Molecule_36 was unable to penetrate the BBB and enter the central nervous system. According to the predictions of multiple tools, Molecule_36 and Molnupiravir were not inhibitors of cytochrome P450 metabolizing enzymes (CYP1A2/CYP2C19/CYP2C9/CYP2D6/CYP3A4) [35,36]. Therefore, it was speculated that the possibility of adverse drug reactions and interactions was low [37]. Total clearance reflects the combined clearance of a small molecule by the liver and kidney. The total clearance of Molecule_36 was predicted to be greater than that of Molnupiravir, so it was assumed that Molecule_36 would be excreted from the body more readily than Molnupiravir [38]. However, neither Molecule_36 nor Molnupiravir was predicted to be a substrate of the renal organic cation transporter 2 (OCT2), so they may not be eliminated via OCT2-mediated renal transport [39].
Molecule_36, as predicted by DS, may not be toxic, whereas Molnupiravir was predicted to be toxic. The LD50, a measure of the acute toxicity of a chemical [40], was predicted to be 6.58 g/kg for Molecule_36 and 3.03 g/kg for Molnupiravir, indicating that Molecule_36 was less acutely toxic than Molnupiravir. Furthermore, neither was predicted to be carcinogenic or mutagenic. pkCSM predicted that neither Molecule_36 nor Molnupiravir had the potential to inhibit hERG (human ether-à-go-go-related gene). Inhibition of the hERG-encoded potassium channel is one of the causes of long QT syndrome and may increase the risk of arrhythmias [41]. Moreover, Molecule_36 was predicted to be neither skin sensitizing nor hepatotoxic.
SwissADME predicted a Synthetic Accessibility (SA) score of 2.5 for Molecule_36 and 4.49 for Molnupiravir. The SA score ranges from 1 to 10, with lower scores indicating simpler synthesis [42]. According to PubChem, Molecule_36 is a novel and unpatented small molecule compound. As a result, we conclude that Molecule_36 may be a potential novel anti-SARS-CoV-2 RdRp small molecule compound. Detailed data are shown in Supplementary Tables S2–S5.

2.6. Analysis of Molecule_36 Molecular Dynamics Simulations

RMSD (Root mean square deviation) trajectories were calculated for the RdRp-Molecule_36 complexes throughout 100 ns NAMD (Nanoscale Molecular Dynamics) production runs. For the reference RdRp structure (7OZV, black trace), the RMSD increased during the initial equilibration phase and reached ~2–3 Å within the first 5 ns. Thereafter, the trajectory remained stable, fluctuating narrowly around 3 Å for the remainder of the 100 ns simulation, consistent with an equilibrated polymerase core as reported in previous molecular dynamics (MD) simulations of SARS-CoV-2 RdRp complexes [43,44] (Figure 5). For the RdRp-Molecule_36 complexes, the RMSD rose rapidly during the first 5 ns to approximately 7–8 Å, reflecting early relaxation and adjustment of the protein–ligand complexes (Figure 5). From ~10 ns onward, the RMSD fluctuated within a relatively stable window of ~7–9 Å, with only a transient increase around 60–70 ns that subsequently relaxed back to the same range. Overall, the RMSD showed no persistent drift or structural disruption, suggesting that the complexes remained stable over the 100 ns simulation.
Results of two independent molecular dynamics simulations of Molecule_36 were shown.
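For readers less familiar with the quantity plotted in Figure 5, RMSD for one trajectory frame is the square root of the mean squared displacement of the selected atoms from their reference positions. The sketch below shows only that arithmetic and assumes the frame has already been superimposed on the reference; the atom selection, fitting procedure, and software settings used in this study are not specified here, and the coordinates are invented.

```python
import math


def rmsd(frame, reference):
    """RMSD (same length unit as the inputs) between two pre-aligned coordinate sets.

    Both arguments are sequences of (x, y, z) tuples listing the same atoms in
    the same order. No superposition is performed here.
    """
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
        for (x, y, z), (rx, ry, rz) in zip(frame, reference)
    )
    return math.sqrt(squared / len(frame))


# Invented two-atom example: every atom is displaced by 3 Å along z.
REFERENCE = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
FRAME = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
EXAMPLE_RMSD = rmsd(FRAME, REFERENCE)
```

Applying the function to every frame against the starting structure yields a trace of the kind described above, where a plateau without persistent drift is read as a stable complex.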

3. Discussion

3.1. Comparative Interaction Analysis of Molnupiravir and Molecule_36

In this study, a workflow integrating a modified deep generative model (mGENTRL) and cheminformatics analysis was applied to identify potential SARS-CoV-2 RdRp inhibitors using Molnupiravir as a reference structure. Through ligand-based similarity search, deep generative modeling, pharmacophore-guided virtual screening, and structure-based docking, Molecule_36 was prioritized as a tractable lead with encouraging pharmacological and safety profiles. RdRp is indispensable for SARS-CoV-2 RNA replication and transcription [45]. Its lack of human homologs makes it a highly selective antiviral target, allowing the design of inhibitors with strong activity and minimal off-target effects [46]. Comparative docking analysis indicated that Molecule_36 interacts with catalytically important residues of RdRp, including TRP617, ASP761, LYS798, GLU811, and SER814, which overlap with the binding sites reported for Remdesivir and Molnupiravir [16]. Specifically, both Molnupiravir and Molecule_36 shared key binding features, including a Conventional Hydrogen Bond (GLU811 and SER814) and Carbon Hydrogen Bond (G15). In Ahmad’s docking analyses, the guanosine moiety of the tested nucleoside/ligand was reported to form hydrogen bonds with GLU811 and SER814, as well as with ASP761 and ALA762, directly identifying GLU811 and SER814 as key hydrogen-bonding residues that stabilize ligand binding [47]. Another study demonstrated that screened RdRp inhibitors formed multiple hydrogen bonds with GLU811 and SER814 and established a salt bridge with GLU811, indicating that GLU811 not only participates in hydrogen bonding but can also engage in electrostatic interactions to stabilize the ligand [48]. Other molecular docking and dynamics simulations of SARS-CoV-2 RdRp have consistently observed GLU811 and SER814 as common contact residues for nucleoside triphosphates or designed antagonists, supporting their roles in substrate positioning and maintenance of the catalytic-site geometry [49]. 
In our docking model, a carbon–hydrogen bond interaction involving G15 was observed within the RdRp-Molecule_36 complex. Although G15 is located in the nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain rather than the catalytic palm subdomain, this residue may help stabilize the local conformation through weak C–H···O contacts, thereby supporting ligand anchoring and maintaining overall RdRp structural integrity [50,51]. This interaction may indicate cross-domain stabilization bridging the NiRAN and palm subdomains, suggesting a structural coupling between catalytic and regulatory regions that contributes to RdRp conformational stability. In addition to the shared hydrogen-bonding interactions at GLU811 and SER814, both ligands exhibited distinct secondary interaction networks that contribute to the overall stability of the RdRp-ligand complex.
For Molnupiravir, four categories of interactions were identified. First, conventional hydrogen bonds were formed with TRP617, ASP761, ALA762, GLU811, and SER814, residues situated within the palm subdomain that are essential for template stabilization and metal-ion coordination during RNA chain elongation [26,47,52,53]. Second, carbon–hydrogen bonds with LYS798 and G15 further reinforced the ligand’s anchoring by adding weak electrostatic and hydrophobic stabilization across both the palm and NiRAN domains [26,50,51]. Third, a Pi-Anion interaction was observed, indicating an electrostatic attraction between the aromatic ring of the ligand’s base moiety and a nearby negatively charged residue, which helps orient the ligand within the active site pocket [54,55,56,57]. Finally, an unfavorable donor–donor interaction was noted at ALA762, suggesting a potential electrostatic repulsion between overlapping hydrogen donors; this may reflect transient or non-productive binding geometries inherent to the nucleoside analog scaffold of Molnupiravir [26,47,58,59]. In contrast, Molecule_36 displayed five types of interactions. It retained the conventional hydrogen bonds with GLU811 and SER814, preserving the key anchoring contacts characteristic of nucleoside-like ligands [47]. The molecule also formed carbon–hydrogen bonds with TRP617, ASP761, and G15, extending the interaction network toward both catalytic and structural domains [26,51]. A Pi-Alkyl interaction with TRP800 and an Alkyl contact with LYS798 introduced additional hydrophobic stabilization, which likely enhances conformational rigidity and reduces solvent exposure of the ligand [26,60]. Moreover, a localized electrostatic repulsion between negatively charged carboxylate groups near GLU811 was detected in our docking model, which may transiently arise when both ligand and residue possess proximal negative centers [47]. 
Such negative–negative contacts are classified as unfavorable electrostatic interactions that can transiently destabilize the complex but are frequently observed in dynamic docking ensembles [58]. Nevertheless, this type of local charge repulsion can also act as a subtle electrostatic steering or fine-tuning mechanism, guiding ligands toward an energetically favorable orientation within the catalytic cavity [61]. Collectively, both Molnupiravir and Molecule_36 engage a conserved network of interactions within the palm subdomain, notably through hydrogen bonds with GLU811 and SER814, residues essential for RNA template stabilization and catalysis. These shared contacts highlight a common anchoring mechanism consistent with nucleoside-like inhibitors targeting SARS-CoV-2 RdRp. However, Molecule_36 differs from Molnupiravir in several important aspects. It extends the interaction network toward the NiRAN domain via a C–H···O contact with G15 and introduces additional hydrophobic (Pi-Alkyl and Alkyl) interactions involving TRP800 and LYS798, which may enhance binding stability and conformational rigidity. In contrast, Molnupiravir primarily relies on a denser hydrogen-bonding network but also exhibits an unfavorable donor–donor contact at ALA762, reflecting the limitations of its nucleoside analog geometry. Overall, these comparisons suggest that while both ligands share critical palm-site anchoring motifs, Molecule_36 achieves a broader and potentially more stable binding profile through cross-domain and hydrophobic stabilization, indicating a promising optimization of Molnupiravir’s pharmacophoric framework for future antiviral design.

3.2. Biological and Pharmacological Implications

The observed interaction profile of Molecule_36 carries several biological and pharmacological implications. By maintaining hydrogen-bond contacts with GLU811 and SER814, which are central residues to RNA template alignment and catalysis [47], the molecule is predicted to interfere directly with the elongation phase of viral RNA synthesis. Its additional hydrophobic and C–H···O contacts extending into the NiRAN domain may further stabilize the polymerase in a non-productive conformation, potentially hindering the transition between initiation and elongation cycles; such weak hydrogen-bond-like and hydrophobic interactions are known to stabilize protein–ligand complexes, and cross-domain coupling between the NiRAN and polymerase core has been shown to contribute to RdRp structural stability and catalytic regulation [45,62,63,64,65].
From a drug-design perspective, Molecule_36 combines key pharmacophoric features of Molnupiravir with improved physicochemical and ADMET properties, supporting its tractability as a small-molecule antiviral lead. The MD analysis provides additional insight into the interaction behavior of Molecule_36 and the RdRp. The reference RdRp structure (7OZV) displayed a short equilibration phase, reaching an RMSD of ~2–3 Å within the first 5 ns, after which it remained tightly stabilized around 3 Å for the remainder of the 100 ns simulation. This stable plateau is consistent with prior MD studies showing that SARS-CoV-2 RdRp maintains a rigid and well-packed polymerase core due to extensive intra-domain hydrogen bonding and conserved catalytic architecture [43,66]. The stability of the reference trajectory indicates that the system preparation and simulation parameters were appropriate and that the polymerase backbone remained structurally intact throughout the simulation. By comparison, the RdRp-Molecule_36 complexes showed a higher but still bounded RMSD profile. RMSD increased rapidly during the first 5 ns to ~7–8 Å, reflecting early relaxation and accommodation of the ligand within the binding pocket. From approximately 10 ns onward, the RMSD fluctuated within a relatively stable range of ~7–9 Å, with only a transient rise near 60–70 ns that subsequently relaxed back into the same window. This stable yet dynamic behavior is consistent with prior MD studies which confirm that RdRp complexes reach a dynamic equilibrium and maintain stability throughout the simulation trajectory while preserving a structurally conserved polymerase core [67,68]. The broader motions observed for Molecule_36 likely reflect its more extended interaction network, spanning both the palm and NiRAN domains, as suggested by the docking results. 
Prior MD studies of RdRp-ligand complexes have similarly reported cross-domain adjustments when inhibitors engage residues beyond the canonical palm site [67,68]. Weak C–H···O interactions with G15 and hydrophobic contacts with TRP800 may contribute to these local rearrangements, consistent with structural analyses showing that such weak hydrogen-bond-like contacts and hydrophobic interactions frequently support ligand stabilization and enable subtle induced-fit adaptations in protein–ligand complexes [63,64,69]. This behavior may also be functionally relevant, as previous RdRp studies have demonstrated that ligands promoting moderate backbone mobility can interfere with template translocation or catalytic synchronization rather than acting solely through chain termination [51,70].
The integration of deep generative modeling with cheminformatics-based filtering represents a promising strategy for rapidly exploring novel chemical spaces around validated antiviral scaffolds. These findings highlight how AI-assisted molecular generation, guided by structural insights into RdRp, may accelerate the discovery of selective inhibitors with enhanced binding stability and reduced off-target liability.

3.3. AI-Assisted Drug Design Framework

In recent years, AI-assisted drug discovery has reshaped early-stage antiviral research by integrating molecular representation learning, generative modeling, and structure-based evaluation within a unified workflow [71]. Strategies such as ligand-based similarity search, pharmacophore-based virtual screening, and molecular docking remain foundational [72]; however, their strength lies mainly in the repurposing of existing drugs like Remdesivir, Molnupiravir, Galidesivir, Ribavirin, Sofosbuvir, Tenofovir, and Favipiravir [73]. In contrast, AI-driven generative approaches can transcend these limitations by directly proposing novel small molecules with drug-like properties and optimized ADMET profiles, thereby addressing the need for first-in-class antiviral candidates [74]. Several machine-learning paradigms have been applied in this context, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for feature extraction, variational autoencoders (VAEs) and generative adversarial networks (GANs) for molecular generation, and reinforcement learning (RL) for reward-guided chemical optimization [75]. For example, Beck et al. employed a deep-learning-based drug-target interaction model (MT-DTI) using CNN and Transformer architectures to predict binding affinities between FDA-approved drugs and SARS-CoV-2 proteins [76,77]. Olivecrona et al. (2017) introduced a pioneering RL framework for de novo molecular design, where a SMILES-based RNN pretrained on existing compounds was fine-tuned via policy-gradient optimization using an augmented likelihood objective to balance exploration and chemical realism; this approach enabled generation of novel, synthesizable molecules optimized for desired properties such as activity and drug-likeness, establishing a key foundation for later RL-driven generative drug-design methods [15]. Similarly, Zhavoronkov et al. demonstrated a GAN-derived deep generative framework that produced novel 3CLpro inhibitors within weeks of model training [17]. Bung et al. further applied an RNN-based deep reinforcement learning (deep RL) approach to design candidate small molecules targeting the SARS-CoV-2 3CL protease, showing conceptual similarity to the generative process employed in this study [19]. However, while Bung et al. relied on transfer learning from the ChEMBL database of drug-like compounds [19] to retrain their model, our approach utilized Molnupiravir-like chemical scaffolds from the BIOVIA ACD database as seed structures to train an mGENTRL model. In addition, their compound screening relied mainly on drug-likeness rules and physicochemical property filters, whereas our workflow couples a VAE-RL generator with pharmacophore-based virtual screening and structure-based docking to refine candidate selection. Building upon these advances, our study integrates mGENTRL with cheminformatics-guided screening to enable the de novo generation and prioritization of potential SARS-CoV-2 RdRp inhibitors. Unlike models focused solely on affinity prediction or generative exploration, our workflow combines de novo molecular generation, pharmacophore-based filtering, and structure-based docking, providing an end-to-end framework for both creation and rational triage of antiviral candidates.
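The augmented likelihood objective mentioned above for the framework of Olivecrona et al. can be illustrated numerically. The sketch below assumes the commonly described form, in which the prior log-likelihood of a generated sequence is shifted by a scaled score and the agent is penalized by the squared difference between that target and its own log-likelihood. It is not the mGENTRL objective used in this study, and the value of `sigma` and all numbers are illustrative.

```python
def augmented_log_likelihood(prior_logp: float, score: float, sigma: float) -> float:
    """Target log-likelihood: prior log-likelihood shifted by sigma times the score."""
    return prior_logp + sigma * score


def agent_loss(agent_logp: float, prior_logp: float, score: float, sigma: float) -> float:
    """Squared gap between the augmented target and the agent's log-likelihood."""
    return (augmented_log_likelihood(prior_logp, score, sigma) - agent_logp) ** 2


# Illustrative numbers only: one sequence scored 0.5 by a hypothetical scoring function.
LOSS_MATCHED = agent_loss(agent_logp=-20.0, prior_logp=-25.0, score=0.5, sigma=10.0)
LOSS_HIGH_SCORE = agent_loss(agent_logp=-20.0, prior_logp=-25.0, score=1.0, sigma=10.0)
```

Because the target stays anchored to the prior, a sequence is rewarded only to the extent that it both scores well and remains plausible under the pretrained model, which is the balance between exploration and chemical realism described in the text.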

3.4. Limitations and Future Directions

This study has several limitations. First, both the similarity search and the training of the generative model relied on the ACD database, which inherently limits the chemical-space diversity of the generated candidates. Second, Molecule_36 has not yet been tested in vitro or in vivo and should therefore be regarded as a computationally generated scaffold rather than a finalized drug-like molecule. Additional chemical refinement and experimental validation will be required to determine its stability, mechanism of action, and antiviral efficacy.

4. Materials and Methods

4.1. Study Design and Process

This study aimed to develop potential anti-SARS-CoV-2 compounds using a generative deep neural network (GDNN) combined with cheminformatics [16,78,79]. Molnupiravir (molecular formula C13H19N3O7) served as the template for building the GDNN training dataset; its structure is shown in Supplementary Figure S1. To obtain chemical structures similar to Molnupiravir, the BIOVIA Available Chemicals Directory (BIOVIA ACD, version 2020.08) database was screened by ligand-based similarity search in the BIOVIA Discovery Studio (DS) software (version 2022) [80].
After the training dataset was established, a reinforcement learning model was used to generate small-molecule structures. Pharmacophore-based virtual screening was then performed to identify generated molecules that share chemical features with Molnupiravir. The compounds retained by this screening were docked against SARS-CoV-2 RdRp, with Molnupiravir as the positive control, to identify potential new compounds with higher predicted affinity for SARS-CoV-2 RdRp.
Finally, ADMET analysis was applied to confirm the suitability of the small-molecule candidates. Patent information was checked using PubChem [81] and Reaxys [82]. The workflow is shown in Figure 6.

4.2. Preprocessing of Data and Ligand-Based Similarity Search

Molnupiravir was used as the query for a preliminary screen of structurally similar compounds, based on the principle that structurally similar compounds tend to have similar medicinal chemistry properties [83]. The SMILES string of Molnupiravir was taken from the “Canonical SMILES” field in PubChem (Compound CID: 145996610). The BIOVIA ACD database contains 126,550,570 drug-like small molecules and is currently one of the largest commercial compound collections in the world, including information on unique compounds from nearly 900 suppliers [80].
Ligands were preprocessed in DS as follows. The ligand structures were first repaired and prepared using the “Prepare Ligands” function. Structures that met the criteria for oral drugs were then selected using the “Filter by Lipinski and Veber Rules” function. Finally, the ligand structures were optimized using the “Full Minimization” function to obtain the lowest-energy structure of each small molecule.
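The oral-drug filter described above can be illustrated outside DS. The following is a minimal sketch, assuming the descriptors have already been computed by some cheminformatics tool; the `Descriptors` class, the function names, and the `max_violations` parameter are hypothetical and are not part of DS. The thresholds are the commonly cited Lipinski (molecular weight ≤ 500, logP ≤ 5, ≤ 5 hydrogen bond donors, ≤ 10 hydrogen bond acceptors) and Veber (≤ 10 rotatable bonds, polar surface area ≤ 140 Å²) criteria; the exact settings used by the DS function may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one ligand (hypothetical container)."""
    mol_weight: float       # g/mol
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    tpsa: float             # topological polar surface area, in square angstroms


def passes_lipinski(d: Descriptors, max_violations: int = 0) -> bool:
    """Count rule-of-five violations and compare with the allowed number.

    max_violations=0 is a strict reading; allowing one violation is also common.
    """
    violations = sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])
    return violations <= max_violations


def passes_veber(d: Descriptors) -> bool:
    """Veber criteria: limited flexibility and limited polar surface area."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def passes_oral_filter(d: Descriptors, max_violations: int = 0) -> bool:
    """A ligand is kept only if it satisfies both rule sets."""
    return passes_lipinski(d, max_violations) and passes_veber(d)
```

A ligand would be retained for the next step only when `passes_oral_filter` returns `True`.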
To identify similar ligands, we used the “Find Similar Molecules by Fingerprints” function in DS, with criteria following Pavadai et al., 2017 [84], to select small molecules with structures similar to Molnupiravir. Similarity was calculated as the Tanimoto coefficient on FCFC_4 fingerprints, with a minimum similarity threshold of 0.75 [84]. The retained chemical structures were then used as the training dataset for the GDNN.
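To make the similarity criterion concrete, the sketch below computes a Tanimoto coefficient on count-based fingerprints and keeps library entries at or above the 0.75 threshold. It is an illustration only: the fingerprints are represented as plain feature-count mappings with hypothetical keys, the min/max generalization of the Tanimoto coefficient to counts is an assumption about how count fingerprints are compared, and the function names are not DS functions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tanimoto_counts(a: Counter, b: Counter) -> float:
    """Tanimoto coefficient for count fingerprints: sum(min) / sum(max).

    For 0/1 counts this reduces to |A and B| / |A or B|.
    """
    keys = set(a) | set(b)
    numerator = sum(min(a[k], b[k]) for k in keys)
    denominator = sum(max(a[k], b[k]) for k in keys)
    return numerator / denominator if denominator else 0.0


def filter_similar(
    query: Counter,
    library: Dict[str, Counter],
    threshold: float = 0.75,
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs with similarity >= threshold, best first."""
    scored = [(name, tanimoto_counts(query, fp)) for name, fp in library.items()]
    hits = [(name, s) for name, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

In this sketch the query would be the Molnupiravir fingerprint and `library` the fingerprints of the prepared ACD ligands.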

4.3. Establishment of the Deep Generative Model

The deep generative model (mGENTRL) was derived from the GENTRL model by adjusting the reward function so that Lipinski’s rule of five served as the screening condition [85]. The preprocessed chemical dataset was fed into the mGENTRL model for training, with the learning rate set to 0.0004. The code is available on GitHub (modified GENTRL model; see the Data Availability Statement). In addition, the “valid perc” parameter of the GENTRL model, which reports the percentage of valid SMILES among the newly generated SMILES strings, was used to evaluate the relative performance of the mGENTRL model [86]. All training computations were run with PyTorch in Python 3.6.5 (Anaconda3-5.2.0-Windows-x86_64) on an NVIDIA GeForce RTX 3070 graphics card.
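The two quantities mentioned above, a rule-of-five-based reward and the percentage of valid SMILES, can be sketched as follows. This is not the mGENTRL implementation (which is in the authors' repository); the function names, the dictionary keys, and the graded form of the reward are assumptions made for illustration. The validity check is passed in as a callable so that any SMILES parser can be used; with RDKit, for example, a string is usually treated as valid when `Chem.MolFromSmiles` does not return `None`.

```python
from typing import Callable, Iterable, Mapping


def valid_percentage(smiles_list: Iterable[str],
                     is_valid: Callable[[str], bool]) -> float:
    """Percentage of generated SMILES strings accepted by `is_valid`."""
    smiles = list(smiles_list)
    if not smiles:
        return 0.0
    n_valid = sum(1 for s in smiles if is_valid(s))
    return 100.0 * n_valid / len(smiles)


def rule_of_five_reward(props: Mapping[str, float]) -> float:
    """Hypothetical graded reward: fraction of rule-of-five criteria satisfied.

    `props` must provide precomputed descriptors under the (assumed) keys
    'mol_weight', 'logp', 'h_bond_donors', and 'h_bond_acceptors'.
    """
    checks = (
        props["mol_weight"] <= 500,
        props["logp"] <= 5,
        props["h_bond_donors"] <= 5,
        props["h_bond_acceptors"] <= 10,
    )
    return sum(checks) / len(checks)
```

A graded reward of this kind gives the generator partial credit for molecules that are close to rule-of-five compliance; a strict pass/fail reward is an equally plausible design.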

4.4. Pharmacophore-Based Virtual Screening

The new compounds sampled by the mGENTRL model were further screened with a pharmacophore-based method built on the Molnupiravir structure, in order to retain new small-molecule compounds with chemical features similar to those of Molnupiravir. First, a Molnupiravir-based pharmacophore model was generated with the “Common Feature Pharmacophore Generation” function in DS, using the following settings. “Conformation Generation” was set to “BEST” to ensure optimal coverage of the conformational space. “Maximum Conformations” was set to 200 and “Energy Threshold” to 10 kcal/mol, so that at most 200 conformations were generated for each small molecule to characterize its conformational space, of which only those within the 10 kcal/mol energy threshold were retained. “Features” comprised hydrogen bond acceptor (HB_ACCEPTOR), hydrogen bond donor (HB_DONOR), hydrophobic (HYDROPHOBIC), positive ionizable (POS_IONIZABLE), and aromatic ring (RING_AROMATIC); all other parameters were kept at their default values.
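The effect of the “Maximum Conformations” and “Energy Threshold” settings can be sketched as a simple selection rule. The example below is illustrative only and does not reproduce the DS conformer generator: it assumes that the energy threshold is measured relative to the lowest-energy conformer found, and the function name and arguments are hypothetical.

```python
from typing import List, Sequence


def select_conformers(
    energies_kcal: Sequence[float],
    energy_threshold: float = 10.0,
    max_conformations: int = 200,
) -> List[int]:
    """Return indices of retained conformers, lowest energy first.

    A conformer is kept if its energy lies within `energy_threshold`
    (kcal/mol) of the minimum, and at most `max_conformations` are returned.
    """
    if not energies_kcal:
        return []
    e_min = min(energies_kcal)
    ranked = sorted(range(len(energies_kcal)), key=lambda i: energies_kcal[i])
    kept = [i for i in ranked if energies_kcal[i] - e_min <= energy_threshold]
    return kept[:max_conformations]
```

Under these assumptions, a molecule with few low-energy conformers contributes fewer than 200 entries, while a flexible molecule is capped at 200.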
Subsequently, the new small molecules sampled from the mGENTRL model were used to create a proprietary 3D chemical database with the “Build 3D Database” function in DS, using the following settings: “Number of Conformations” was set to 200 and “Conformation Method” was set to “BEST”, to ensure the best coverage of the conformational space with up to 200 conformations for each small molecule.
Finally, pharmacophore-based virtual screening was performed on this 3D chemical database with the “Search 3D Database” function, with “Search Method” set to “BEST”, to identify structurally novel small-molecule compounds with potential activity.

4.5. Molecular Docking

To further identify potent small-molecule compounds against SARS-CoV-2, molecular docking was performed. First, the three-dimensional electron microscopy structure of SARS-CoV-2 RdRp (PDB ID: 7OZV) [87] was used as the target protein. Preprocessing included the removal of water molecules and heteroatoms present in this structure. The active-site coordinates of SARS-CoV-2 RdRp were taken from Aftab et al., 2020 [26]. The small-molecule compounds retained by the pharmacophore-based virtual screening described above were then docked. Molecular docking simulations were performed with “Dock Ligands (CDOCKER)” [88], with the Pose Cluster Radius set to 0.5 to ensure the greatest possible diversity of docked conformations and all other parameters kept at their default values. Finally, candidate small molecules were selected when their predicted binding affinity for 7OZV was better than that of Molnupiravir.
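The selection criterion, keeping only candidates that dock better than the positive control, can be written as a small comparison step. The sketch below is an assumption-laden illustration: the function name and arguments are hypothetical, the scores are whatever single docking score per compound the user exports, and the `higher_is_better` flag exists because some programs report negated energies (larger is more favorable) while others report raw energies (more negative is more favorable).

```python
from typing import Dict, List


def select_better_binders(
    scores: Dict[str, float],
    reference: str,
    higher_is_better: bool = True,
) -> List[str]:
    """Return candidates that score better than `reference`, best first."""
    ref_score = scores[reference]
    if higher_is_better:
        better = {n: s for n, s in scores.items()
                  if n != reference and s > ref_score}
    else:
        better = {n: s for n, s in scores.items()
                  if n != reference and s < ref_score}
    # Descending order when larger is better, ascending otherwise.
    return sorted(better, key=better.get, reverse=higher_is_better)
```

Here `reference` would be the entry for Molnupiravir docked to 7OZV, and the returned list would be passed on to the ADMET step.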

4.6. Prediction of ADMET Properties

In this study, SwissADME [89], pkCSM [90], and the DS software were used to predict and analyze the ADMET properties of the small-molecule compounds and thereby assess their potential as drugs. For selected small-molecule compounds, the “ADMET Descriptors” function of the DS software predicts pharmacological properties such as aqueous solubility, blood–brain barrier (BBB) penetration, cytochrome P450 2D6 inhibition, hepatotoxicity, human intestinal absorption, and plasma protein binding. The “Toxicity Prediction” function can also predict carcinogenicity, mutagenicity, skin sensitization, developmental toxicity potential, rat oral LD50, and biodegradability [91].

4.7. Analysis of Molecular Dynamics Simulations

Refined protein–ligand complexes were subjected to all-atom MD simulations using the CHARMM36 force field implemented in DS (version 2024). The system was solvated in a TIP3P water box extending 7 Å from any solute atom and neutralized with Na⁺/Cl⁻ ions at an ionic concentration of 0.15 M. Periodic boundary conditions were applied in all directions. Nonbonded interactions were calculated using a 12 Å cutoff with a switching function beginning at 10 Å, and long-range electrostatics were handled with the Particle Mesh Ewald method. Bond distances involving hydrogen atoms were constrained using the SHAKE algorithm, allowing a 2 fs integration time step. Temperature was maintained at 310 K via Langevin dynamics (damping coefficient: 1 ps⁻¹), and pressure was controlled at 1 atm using the Langevin piston method. After the standard dynamics cascade, production MD simulations were continued in NAMD without positional restraints. Trajectories were recorded every 50 ps for subsequent analyses. Two independent MD simulations were performed, each extending for 100 nanoseconds (ns).
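The simulation length, time step, and output interval stated above determine the number of integration steps and saved frames. The short sketch below works through that arithmetic; the `MDSchedule` class and its field names are hypothetical bookkeeping and do not correspond to DS or NAMD configuration keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDSchedule:
    """Bookkeeping for the production runs described in the text."""
    timestep_fs: int = 2         # integration time step
    run_length_ns: int = 100     # length of each production run
    save_interval_ps: int = 50   # trajectory output interval
    n_runs: int = 2              # independent simulations

    @property
    def steps_per_run(self) -> int:
        # 1 ns = 1,000,000 fs
        return self.run_length_ns * 1_000_000 // self.timestep_fs

    @property
    def steps_per_frame(self) -> int:
        # 1 ps = 1,000 fs
        return self.save_interval_ps * 1_000 // self.timestep_fs

    @property
    def frames_per_run(self) -> int:
        return self.steps_per_run // self.steps_per_frame

    @property
    def total_sampling_ns(self) -> int:
        return self.run_length_ns * self.n_runs
```

With the default values, each run corresponds to 50,000,000 integration steps, one saved frame every 25,000 steps, and 2000 frames per run, for 200 ns of total sampling across the two runs.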

5. Conclusions

In summary, this study explores a workflow that combines a deep generative neural network with cheminformatics-based screening to support the rapid identification of potential small-molecule inhibitors targeting SARS-CoV-2 RdRp. Among the generated compounds, Molecule_36 showed encouraging predicted binding interactions and ADMET characteristics, and no exact-structure patent records were found in an extended public patent search. While further synthesis and biological evaluation are required, this approach provides a feasible direction and a practical framework for accelerating the early discovery of new antiviral candidates.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms262412017/s1.

Author Contributions

Conceptualization, C.-M.H., H.-Y.H. and M.-C.L. Methodology, S.-Y.L., C.-W.L. and M.-C.L. Draft writing, S.-Y.L. and M.-C.L. Manuscript revised, C.-M.H., H.-Y.H., C.-W.L. and M.-C.L. Funding acquisition, M.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Ministry of National Defense Medical Affairs Bureau (MND-MAB-C09-112035 to M.-C.L.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code used in this study is openly available on GitHub at https://github.com/young19990726/mGENTRL-model (accessed on 10 October 2025). The molecular data used for model training was obtained from the BIOVIA Available Chemicals Directory (BIOVIA ACD) database, which is a commercial dataset. The authors are not permitted to redistribute the dataset.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Fisher, D.; Heymann, D. Q&A: The novel coronavirus outbreak causing COVID-19. BMC Med. 2020, 18, 57.
  2. WHO. WHO Coronavirus (COVID-19) Dashboard. 2023. Available online: https://data.who.int/dashboards/covid19 (accessed on 31 March 2023).
  3. Murgolo, N.; Therien, A.G.; Howell, B.; Klein, D.; Koeplinger, K.; Lieberman, L.A.; Adam, G.C.; Flynn, J.; McKenna, P.; Swaminathan, G.; et al. SARS-CoV-2 tropism, entry, replication, and propagation: Considerations for drug discovery and development. PLoS Pathog. 2021, 17, e1009225.
  4. Mahase, E. COVID-19: Pfizer’s paxlovid is 89% effective in patients at risk of serious illness, company reports. BMJ 2021, 375, n2713.
  5. Mahase, E. COVID-19: UK becomes first country to authorise antiviral molnupiravir. BMJ 2021, 375, n2697.
  6. DiMasi, J.A.; Grabowski, H.G.; Hansen, R.W. Innovation in the pharmaceutical industry: New estimates of R&D costs. J. Health Econ. 2016, 47, 20–33.
  7. Mouchlis, V.D.; Afantitis, A.; Serra, A.; Fratello, M.; Papadiamantis, A.G.; Aidinis, V.; Lynch, I.; Greco, D.; Melagraki, G. Advances in de Novo Drug Design: From Conventional to Machine Learning Methods. Int. J. Mol. Sci. 2021, 22, 1676.
  8. Paul, D.; Sanap, G.; Shenoy, S.; Kalyane, D.; Kalia, K.; Tekade, R.K. Artificial intelligence in drug discovery and development. Drug Discov. Today 2021, 26, 80–93.
  9. Vatansever, S.; Schlessinger, A.; Wacker, D.; Kaniskan, H.; Jin, J.; Zhou, M.M.; Zhang, B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med. Res. Rev. 2021, 41, 1427–1473.
  10. Yang, X.; Wang, Y.; Byrne, R.; Schneider, G.; Yang, S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119, 10520–10594.
  11. Pirzada, R.H.; Javaid, N.; Choi, S. The Roles of the NLRP3 Inflammasome in Neurodegenerative and Metabolic Diseases and in Relevant Advanced Therapeutic Interventions. Genes 2020, 11, 131.
  12. Gupta, R.; Srivastava, D.; Sahu, M.; Tiwari, S.; Ambasta, R.K.; Kumar, P. Artificial intelligence to deep learning: Machine intelligence approach for drug discovery. Mol. Divers. 2021, 25, 1315–1360.
  13. Gómez-Bombarelli, R.; Wei, J.N.; Duvenaud, D.; Hernández-Lobato, J.M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Adams, R.P.; Aspuru-Guzik, A. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Cent. Sci. 2018, 4, 268–276.
  14. De Cao, N.K.; Kipf, T. MolGAN: An implicit generative model for small molecular graphs. arXiv 2022, arXiv:1805.11973.
  15. Olivecrona, M.; Blaschke, T.; Engkvist, O.; Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 2017, 9, 48.
  16. Zhavoronkov, A.; Ivanenkov, Y.A.; Aliper, A.; Veselov, M.S.; Aladinskiy, V.A.; Aladinskaya, A.V.; Terentiev, V.A.; Polykovskiy, D.A.; Kuznetsov, M.D.; Asadulaev, A. GENTRL (Deep learning enables rapid identification of potent DDR1 kinase inhibitors). Nat. Biotechnol. 2019, 37, 1038–1040.
  17. Zhavoronkov, A.; Zagribelnyy, B.; Zhebrak, A.; Aladinskiy, V.; Terentiev, V.; Vanhaelen, Q.; Bezrukov, D.S.; Polykovskiy, D.; Shayakhmetov, R.; Filimonov, A. Potential non-covalent SARS-CoV-2 3C-like protease inhibitors designed using generative deep learning approaches and reviewed by human medicinal chemist in virtual reality. ChemRxiv 2020.
  18. Mao, J.; Wang, J.; Zeb, A.; Cho, K.H.; Jin, H.; Kim, J.; Lee, O.; Wang, Y.; No, K.T. Transformer-Based Molecular Generative Model for Antiviral Drug Design. J. Chem. Inf. Model. 2024, 64, 2733–2745.
  19. Bung, N.; Krishnan, S.R.; Bulusu, G.; Roy, A. De novo design of new chemical entities for SARS-CoV-2 using artificial intelligence. Future Med. Chem. 2021, 13, 575–585.
  20. Andrianov, A.M.; Shuldau, M.A.; Furs, K.V.; Yushkevich, A.M.; Tuzikov, A.V. AI-Driven De Novo Design and Molecular Modeling for Discovery of Small-Molecule Compounds as Potential Drug Candidates Targeting SARS-CoV-2 Main Protease. Int. J. Mol. Sci. 2023, 24, 8083.
  21. Li, S.; Wang, L.; Meng, J.; Zhao, Q.; Zhang, L.; Liu, H. De Novo design of potential inhibitors against SARS-CoV-2 Mpro. Comput. Biol. Med. 2022, 147, 105728.
  22. Arshia, A.H.; Shadravan, S.; Solhjoo, A.; Sakhteman, A.; Sami, A. De novo design of novel protease inhibitor candidates in the treatment of SARS-CoV-2 using deep learning, docking, and molecular dynamic simulations. Comput. Biol. Med. 2021, 139, 104967.
  23. Niranjan, V.; Uttarkar, A.; Ramakrishnan, A.; Muralidharan, A.; Shashidhara, A.; Acharya, A.; Tarani, A.; Kumar, J. De Novo Design of Anti-COVID Drugs Using Machine Learning-Based Equivariant Diffusion Model Targeting the Spike Protein. Curr. Issues Mol. Biol. 2023, 45, 4261–4284.
  24. Khan, A.; Bhrdwaj, A.; Sharma, K.; Arugonda, R.; Kaur, N.; Chaudhary, R.; Shaheen, U.; Panwar, U.; Natchimuthu, V.; Kumar, A.; et al. Potential Inhibitors of SARS-CoV-2 Developed through Machine Learning, Molecular Docking, and MD Simulation. Med. Chem. 2025; Online ahead of print.
  25. Gurung, A.B.; Ali, M.A.; Lee, J.; Farah, M.A.; Al-Anazi, K.M. An Updated Review of Computer-Aided Drug Design and Its Application to COVID-19. Biomed. Res. Int. 2021, 2021, 8853056.
  26. Aftab, S.O.; Ghouri, M.Z.; Masood, M.U.; Haider, Z.; Khan, Z.; Ahmad, A.; Munawar, N. Analysis of SARS-CoV-2 RNA-dependent RNA polymerase as a potential therapeutic drug target using a computational approach. J. Transl. Med. 2020, 18, 275.
  27. Kulabaş, N.; Yeşil, T.; Küçükgüzel, I. Evaluation of molnupiravir analogues as novel coronavirus (SARS-CoV-2) RNA-dependent RNA polymerase (RdRp) inhibitors—An in silico docking and admet simulation study. J. Res. Pharm. 2021, 25, 967–981.
  28. Cheng, A.; Merz, K.M., Jr. Prediction of aqueous solubility of a diverse set of compounds using quantitative structure-property relationships. J. Med. Chem. 2003, 46, 3572–3580.
  29. Egan, W.J.; Merz, K.M., Jr.; Baldwin, J.J. Prediction of drug absorption using multivariate statistics. J. Med. Chem. 2000, 43, 3867–3877.
  30. Hubatsch, I.; Ragnarsson, E.G.; Artursson, P. Determination of drug permeability and prediction of drug absorption in Caco-2 monolayers. Nat. Protoc. 2007, 2, 2111–2119.
  31. Nguyen, T.-T.-L.; Duong, V.-A.; Maeng, H.-J. Pharmaceutical Formulations with P-Glycoprotein Inhibitory Effect as Promising Approaches for Enhancing Oral Drug Absorption and Bioavailability. Pharmaceutics 2021, 13, 1103.
  32. Martin, Y.C. A bioavailability score. J. Med. Chem. 2005, 48, 3164–3170.
  33. Smith, D.A.; Beaumont, K.; Maurer, T.S.; Di, L. Volume of distribution in drug design: Miniperspective. J. Med. Chem. 2015, 58, 5691–5698.
  34. van de Waterbeemd, H.; Gifford, E. ADMET in silico modelling: Towards prediction paradise? Nat. Rev. Drug Discov. 2003, 2, 192–204.
  35. Di, L. The role of drug metabolizing enzymes in clearance. Expert. Opin. Drug Metab. Toxicol. 2014, 10, 379–393.
  36. Susnow, R.G.; Dixon, S.L. Use of robust classification techniques for the prediction of human cytochrome P450 2D6 inhibition. J. Chem. Inf. Comput. Sci. 2003, 43, 1308–1315.
  37. Lynch, T.; Price, A. The effect of cytochrome P450 metabolism on drug response, interactions, and adverse effects. Am. Fam. Physician 2007, 76, 391–396.
  38. Yap, C.W.; Li, Z.R.; Chen, Y.Z. Quantitative structure-pharmacokinetic relationships for drug clearance by using statistical learning methods. J. Mol. Graph. Model. 2006, 24, 383–395.
  39. Koepsell, H. Organic Cation Transporters in Health and Disease. Pharmacol. Rev. 2020, 72, 253–319.
  40. Parra, A.L.; Yhebra, R.S.; Sardiñas, I.G.; Buela, L.I. Comparative study of the assay of Artemia salina L. and the estimate of the medium lethal dose (LD50 value) in mice, to determine oral acute toxicity of plant extracts. Phytomedicine 2001, 8, 395–400.
  41. Garrido, A.; Lepailleur, A.; Mignani, S.M.; Dallemagne, P.; Rochais, C. hERG toxicity assessment: Useful guidelines for drug design. Eur. J. Med. Chem. 2020, 195, 112290.
  42. Ertl, P.; Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 2009, 1, 8.
  43. Faisal, S.; Badshah, S.L.; Kubra, B.; Sharaf, M.; Emwas, A.H.; Jaremko, M.; Abdalla, M. Computational Study of SARS-CoV-2 RNA Dependent RNA Polymerase Allosteric Site Inhibition. Molecules 2021, 27, 223.
  44. Tang, W.F.; Tsai, H.P.; Chin, Y.F.; Tsai, S.K.; Lin, C.C.; Ngo, S.T.; Liang, P.H.; Jheng, J.R.; Hsieh, C.F.; Lee, J.C.; et al. Targeting SARS-CoV-2 RNA-dependent RNA polymerase with the coumarin derivative BPR2-D2: Evidence from cell-based and enzymatic studies. Biomed. Pharmacother. 2025, 189, 118252.
  45. Hillen, H.S.; Kokic, G.; Farnung, L.; Dienemann, C.; Tegunov, D.; Cramer, P. Structure of replicating SARS-CoV-2 polymerase. Nature 2020, 584, 154–156.
  46. Zhu, W.; Chen, C.Z.; Gorshkov, K.; Xu, M.; Lo, D.C.; Zheng, W. RNA-Dependent RNA Polymerase as a Target for COVID-19 Drug Discovery. SLAS Discov. 2020, 25, 1141–1151.
  47. Ahmad, M.; Dwivedy, A.; Mariadasse, R.; Tiwari, S.; Kar, D.; Jeyakanthan, J.; Biswal, B.K. Prediction of Small Molecule Inhibitors Targeting the Severe Acute Respiratory Syndrome Coronavirus-2 RNA-dependent RNA Polymerase. ACS Omega 2020, 5, 18356–18366.
  48. Ahmad, J.; Ikram, S.; Ahmad, F.; Rehman, I.U.; Mushtaq, M. SARS-CoV-2 RNA Dependent RNA polymerase (RdRp)—A drug repurposing study. Heliyon 2020, 6, e04502.
  49. Mishra, A.; Rathore, A.S. RNA dependent RNA polymerase (RdRp) as a drug target for SARS-CoV2. J. Biomol. Struct. Dyn. 2022, 40, 6039–6051.
  50. Gao, Y.; Yan, L.; Huang, Y.; Liu, F.; Zhao, Y.; Cao, L.; Wang, T.; Sun, Q.; Ming, Z.; Zhang, L.; et al. Structure of the RNA-dependent RNA polymerase from COVID-19 virus. Science 2020, 368, 779–782.
  51. Shannon, A.; Le, N.T.; Selisko, B.; Eydoux, C.; Alvarez, K.; Guillemot, J.C.; Decroly, E.; Peersen, O.; Ferron, F.; Canard, B. Remdesivir and SARS-CoV-2: Structural requirements at both nsp12 RdRp and nsp14 Exonuclease active-sites. Antiviral. Res. 2020, 178, 104793.
  52. Latosińska, M.; Latosińska, J.N. Favipiravir Analogues as Inhibitors of SARS-CoV-2 RNA-Dependent RNA Polymerase, Combined Quantum Chemical Modeling, Quantitative Structure-Property Relationship, and Molecular Docking Study. Molecules 2024, 29, 441.
  53. Ebrahimi, K.S.; Ansari, M.; Hosseyni Moghaddam, M.S.; Ebrahimi, Z.; Salehi, Z.; Shahlaei, M.; Moradi, S. In silico investigation on the inhibitory effect of fungal secondary metabolites on RNA dependent RNA polymerase of SARS-CoV-II: A docking and molecular dynamic simulation study. Comput. Biol. Med. 2021, 135, 104613.
  54. Ahmed, S.; Mahtarin, R.; Ahmed, S.S.; Akter, S.; Islam, M.S.; Mamun, A.A.; Islam, R.; Hossain, M.N.; Ali, M.A.; Sultana, M.U.C.; et al. Investigating the binding affinity, interaction, and structure-activity-relationship of 76 prescription antiviral drugs targeting RdRp and Mpro of SARS-CoV-2. J. Biomol. Struct. Dyn. 2021, 39, 6290–6305.
  55. Goswami, D. Comparative assessment of RNA-dependent RNA polymerase (RdRp) inhibitors under clinical trials to control SARS-CoV2 using rigorous computational workflow. RSC Adv. 2021, 11, 29015–29028.
  56. Frontera, A.; Quinonero, D.; Deya, P.M. Cation–π and anion–π interactions. WIREs Comput. Mol. Sci. 2011, 1, 440–459.
  57. Kuzniak-Glanowska, E.; Glanowski, M.; Kurczab, R.; Bojarski, A.J.; Podgajny, R. Mining anion-aromatic interactions in the Protein Data Bank. Chem. Sci. 2022, 13, 3984–3998.
  58. Abdulhameed Odhar, H.; Fadhil Hashim, A.; Sami Humad, S. Molecular docking analysis and dynamics simulation of salbutamol with the monoamine oxidase B (MAO-B) enzyme. Bioinformation 2022, 18, 304–309.
  59. Voitsitskyi, T.; Bdzhola, V.; Stratiichuk, R.; Koleiev, I.; Ostrovsky, Z.; Vozniak, V.; Khropachov, I.; Henitsoi, P.; Popryho, L.; Zhytar, R.; et al. Augmenting a training dataset of the generative diffusion model for molecular docking with artificial binding pockets. RSC Adv. 2024, 14, 1341–1353.
  60. Khan, F.I.; Kang, T.; Ali, H.; Lai, D. Remdesivir Strongly Binds to RNA-Dependent RNA Polymerase, Membrane Protein, and Main Protease of SARS-CoV-2: Indication From Molecular Modeling and Simulations. Front. Pharmacol. 2021, 12, 710778, https://doi.org/10.3389/fphar.2021.710778. Erratum in Front. Pharmacol. 2022, 13, 1027099.
  61. Zhou, H.X.; Pang, X. Electrostatic Interactions in Protein Structure, Folding, Binding, and Condensation. Chem. Rev. 2018, 118, 1691–1741.
  62. Kokic, G.; Hillen, H.S.; Tegunov, D.; Dienemann, C.; Seitz, F.; Schmitzova, J.; Farnung, L.; Siewert, A.; Höbartner, C.; Cramer, P. Mechanism of SARS-CoV-2 polymerase stalling by remdesivir. Nat. Commun. 2021, 12, 279.
  63. Itoh, Y.; Nakashima, Y.; Tsukamoto, S.; Kurohara, T.; Suzuki, M.; Sakae, Y.; Oda, M.; Okamoto, Y.; Suzuki, T. N(+)-C-H···O Hydrogen bonds in protein-ligand complexes. Sci. Rep. 2019, 9, 767.
  64. Ferreira de Freitas, R.; Schapira, M. A systematic analysis of atomic protein-ligand interactions in the PDB. Medchemcomm 2017, 8, 1970–1981.
  65. Park, G.J.; Osinski, A.; Hernandez, G.; Eitson, J.L.; Majumdar, A.; Tonelli, M.; Henzler-Wildman, K.; Pawłowski, K.; Chen, Z.; Li, Y.; et al. The mechanism of RNA capping by SARS-CoV-2. Nature 2022, 609, 793–800.
  66. Koulgi, S.; Jani, V.; Uppuladinne, V.N.M.; Sonavane, U.; Joshi, R. Natural plant products as potential inhibitors of RNA dependent RNA polymerase of Severe Acute Respiratory Syndrome Coronavirus-2. PLoS ONE 2021, 16, e0251801.
  67. Elfiky, A.A. SARS-CoV-2 RNA dependent RNA polymerase (RdRp) targeting: An in silico perspective. J. Biomol. Struct. Dyn. 2021, 39, 3204–3212.
  68. Brunt, D.; Lakernick, P.M.; Wu, C. Discovering new potential inhibitors to SARS-CoV-2 RNA dependent RNA polymerase (RdRp) using high throughput virtual screening and molecular dynamics simulations. Sci. Rep. 2022, 12, 19986.
  69. Sarkhel, S.; Desiraju, G.R. N-H...O, O-H...O, and C-H...O hydrogen bonds in protein-ligand complexes: Strong and weak interactions in molecular recognition. Proteins 2004, 54, 247–259.
  70. Naydenova, K.; Muir, K.W.; Wu, L.F.; Zhang, Z.; Coscia, F.; Peet, M.J.; Castro-Hartmann, P.; Qian, P.; Sader, K.; Dent, K.; et al. Structure of the SARS-CoV-2 RNA-dependent RNA polymerase in the presence of favipiravir-RTP. Proc. Natl. Acad. Sci. USA 2021, 118, e2021946118.
  71. Ocana, A.; Pandiella, A.; Privat, C.; Bravo, I.; Luengo-Oroz, M.; Amir, E.; Gyorffy, B. Integrating artificial intelligence in drug discovery and early drug development: A transformative approach. Biomark. Res. 2025, 13, 45.
  72. Elfiky, A.A. Ribavirin, Remdesivir, Sofosbuvir, Galidesivir, and Tenofovir against SARS-CoV-2 RNA dependent RNA polymerase (RdRp): A molecular docking study. Life Sci. 2020, 253, 117592.
  73. Vicenti, I.; Zazzi, M.; Saladini, F. SARS-CoV-2 RNA-dependent RNA polymerase as a therapeutic target for COVID-19. Expert. Opin. Ther. Pat. 2021, 31, 325–337.
  74. Prasad, K.; Kumar, V. Artificial intelligence-driven drug repurposing and structural biology for SARS-CoV-2. Curr. Res. Pharmacol. Drug Discov. 2021, 2, 100042.
  75. Piccialli, F.; di Cola, V.S.; Giampaolo, F.; Cuomo, S. The Role of Artificial Intelligence in Fighting the COVID-19 Pandemic. Inf. Syst. Front. 2021, 23, 1467–1497.
  76. Shin, B.; Park, S.; Kang, K.; Ho, J.C. Self-attention based molecule representation for predicting drug-target interaction. In Proceedings of the Machine Learning for Healthcare Conference, Ann Arbor, MI, USA, 8–10 August 2019; PMLR: Cambridge, MA, USA, 2019.
  77. Beck, B.R.; Shin, B.; Choi, Y.; Park, S.; Kang, K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput. Struct. Biotechnol. J. 2020, 18, 784–790.
  78. Schaller, D.; Šribar, D.; Noonan, T.; Deng, L.; Nguyen, T.N.; Pach, S.; Machalz, D.; Bermudez, M.; Wolber, G. Next generation 3D pharmacophore modeling. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2020, 10, e1468.
  79. Maia, E.H.B.; Assis, L.C.; De Oliveira, T.A.; Da Silva, A.M.; Taranto, A.G. Structure-based virtual screening: From classical to artificial intelligence. Front. Chem. 2020, 8, 343.
  80. BIOVIA. Discovery Studio Software Version. 2022. Available online: https://www.3ds.com/products/biovia (accessed on 12 August 2025).
  81. NIH. PubChem. 2023. Available online: https://pubchem.ncbi.nlm.nih.gov (accessed on 19 February 2023).
  82. Goodman, J. Computer Software Review: Reaxys. J. Chem. Inf. Model. 2009, 49, 2897–2898.
  83. O’Boyle, N.M.; Sayle, R.A. Comparing structural fingerprints using a literature-based similarity benchmark. J. Cheminform. 2016, 8, 36.
  84. Pavadai, E.; Kaur, G.; Wittlin, S.; Chibale, K. Identification of steroid-like natural products as antiplasmodial agents by 2D and 3D similarity-based virtual screening. Medchemcomm 2017, 8, 1152–1157.
  85. Lipinski, C.A. Lead-and drug-like compounds: The rule-of-five revolution. Drug Discov. Today Technol. 2004, 1, 337–341.
  86. Yassine, R.; Makrem, M.; Farhat, F. Active Learning and the Potential of Neural Networks Accelerate Molecular Screening for the Design of a New Molecule Effective against SARS-CoV-2. Biomed. Res. Int. 2021, 2021, 6696012.
  87. PDB, R. RCSB Protein Data Bank. 2022. Available online: https://www.rcsb.org/structure/7OZV (accessed on 3 September 2022).
  88. Wu, G.; Robertson, D.H.; Brooks, C.L., III; Vieth, M. Detailed analysis of grid-based molecular docking: A case study of CDOCKER—A CHARMm-based MD docking algorithm. J. Comput. Chem. 2003, 24, 1549–1562.
  89. Daina, A.; Michielin, O.; Zoete, V. SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep. 2017, 7, 42717.
  90. Pires, D.E.; Blundell, T.L.; Ascher, D.B. pkCSM: Predicting Small-Molecule Pharmacokinetic and Toxicity Properties Using Graph-Based Signatures. J. Med. Chem. 2015, 58, 4066–4072.
  91. Pal, S.; Kumar, V.; Kundu, B.; Bhattacharya, D.; Preethy, N.; Reddy, M.P.; Talukdar, A. Ligand-based Pharmacophore Modeling, Virtual Screening and Molecular Docking Studies for Discovery of Potential Topoisomerase I Inhibitors. Comput. Struct. Biotechnol. J. 2019, 17, 291–310.
Figure 1. Examples of compounds newly generated by the mGENTRL model.
Figure 2. Pharmacophore model based on Molnupiravir, generated using DS software. The chemical features identified for Molnupiravir are hydrogen bond acceptors, hydrophobic groups, and hydrogen bond donors.
Figure 3. The 2D structural formula of Molecule_36.
Figure 4. Molecular docking simulation of Molnupiravir and Molecule_36 with SARS-CoV-2 RdRp. (A) 2D interaction plot of Molnupiravir with SARS-CoV-2 RdRp. (B) Binding orientation of Molnupiravir at the active site of SARS-CoV-2 RdRp. (C) 2D interaction plot of Molecule_36 with SARS-CoV-2 RdRp. (D) Binding orientation of Molecule_36 at the active site of SARS-CoV-2 RdRp.
Figure 5. Root mean square deviation (RMSD) of 7OZV (RdRp) in complex with Molecule_36.
Figure 6. Flowchart. Tc: Tanimoto Coefficient. FCFC_4: Count vector form of FCFP4. mGENTRL: Modified generative tensorial reinforcement learning. SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2. RdRp: RNA-dependent RNA polymerase. ADMET: Absorption, distribution, metabolism, excretion, and toxicity.
Table 1. Comparison of potential new small molecule compounds with Molnupiravir in the prediction of ADMET properties.
| Category | Parameter | Source | Molecule_36 | Molnupiravir |
|---|---|---|---|---|
| Absorption | Aqueous solubility | c | −1.032 | −0.894 |
| | Human intestinal absorption | a, c | Good | Low |
| | Caco-2 permeability (log cm/s) | b | 1.082 | 0.531 |
| | P-glycoprotein substrate | a, b | No | No |
| | Bioavailability score | a | 0.56 | 0.55 |
| Distribution | VDss (human, log L/kg) | b | −0.379 | 0.581 |
| | BBB permeability (log BB) | b | −0.67 | −1.057 |
| | CNS permeability (log PS) | b | −3.143 | −3.761 |
| Metabolism | CYP1A2 inhibitor | a, b | No | No |
| | CYP2C19 inhibitor | a, b | No | No |
| | CYP2C9 inhibitor | a, b | No | No |
| | CYP2D6 inhibitor | a, b, c | No | No |
| | CYP3A4 inhibitor | a, b | No | No |
| Excretion | Total clearance (log mL/min/kg) | b | 0.671 | 0.203 |
| | Renal OCT2 substrate | b | No | No |
| Toxicity | Developmental Toxicity Potential | c | No | Yes |
| | Oral Rat LD50 | c | 6.58 | 3.03 |
| | Carcinogenicity | c | No | No |
| | Mutagenicity | c | No | No |
| | Hepatotoxicity | b, c | No | Yes |
| Cardio-toxicity | hERG I inhibitor | b | No | No |
| | hERG II inhibitor | b | No | No |
| | Skin sensitization | b, c | No | No |
| | Biodegradability | c | Yes | Yes |

Source: a, result from the SwissADME online service; b, result from the pkCSM online service; c, result from the Discovery Studio software (version 2022).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, S.-Y.; Hung, C.-M.; Hung, H.-Y.; Lai, C.-W.; Lee, M.-C. Designing Novel Compound Candidates Against SARS-CoV-2 Using Generative Deep Neural Networks and Cheminformatics. Int. J. Mol. Sci. 2025, 26, 12017. https://doi.org/10.3390/ijms262412017
