Article

An Integrative Computational Pipeline for CK2 Inhibitor Discovery in Triple-Negative Breast Cancer Using Virtual Screening, Molecular Dynamics, Machine Learning, and Density Functional Theory

1
Department of Pharmaceutical Sciences, College of Pharmacy, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
2
Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
3
Precision Health Analysis Unit, Translational Research, Dasman Diabetes Institute, Dasman 15462, Kuwait
4
Department of Chemistry and Earth Sciences, College of Arts and Science, Qatar University, Doha P.O. Box 2713, Qatar
5
Surgical Research Section, Department of Surgery, Hamad Medical Corporation, Doha P.O. Box 3050, Qatar
6
Department of Biomedical Sciences, College of Health Sciences, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
7
Department of Chemistry, Jordan University of Science and Technology, Irbid 22110, Jordan
8
School of Pharmacy, Sunway University, Sunway City 47500, Malaysia
*
Authors to whom correspondence should be addressed.
Pharmaceuticals 2026, 19(5), 694; https://doi.org/10.3390/ph19050694
Submission received: 17 March 2026 / Revised: 11 April 2026 / Accepted: 17 April 2026 / Published: 28 April 2026
(This article belongs to the Special Issue Cancer Therapeutics: Drug Repurposing and Computational Strategies)

Abstract

Background: Triple-negative breast cancer (TNBC) remains among the most aggressive and therapeutically unresponsive subtypes due to the absence of ER, PR, and HER2 targets. Casein Kinase II (CK2), a pleiotropic serine/threonine kinase overexpressed in TNBC, represents a compelling target for rational drug design. Methods: Here, we present an AI-integrated benchmarking framework combining virtual drug discovery, molecular dynamics simulations, machine learning-driven QSAR modeling, and quantum-mechanical electronic structure analysis to identify potent CK2 inhibitors from natural product chemical space. Results: A validated XP docking protocol (ROC–AUC = 0.748) screened ~480,000 compounds, yielding seven hits with docking scores more favorable than that of the reference inhibitor CX-4945. Among these, Anastatin B, 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one, Rhein, and aloe emodin acetate exhibited highly favorable docking scores (−11.6 to −13.1 kcal/mol) and stable 200 ns binding dynamics, reflected by RMSD ≤ 2 Å and compact Rg trajectories. MM-PBSA/MM-GBSA analyses supported robust thermodynamic stability, while DFT-derived HOMO–LUMO gaps (3.8–4.3 eV) suggested favorable electronic reactivity for kinase inhibition. Machine learning QSAR models demonstrated good predictive performance, with the best stacking models achieving test R² ≈ 0.69 and consistent cross-validation performance (CV R² ≈ 0.66–0.69), supporting prediction of pIC50 values and prioritization of top-ranked scaffolds. Conclusions: Collectively, this integrative framework bridges AI-based learning and biophysical validation, establishing a reproducible paradigm for CK2 inhibitor discovery in TNBC.

1. Introduction

Breast cancer is the second most common cancer worldwide, according to the World Health Organization (WHO) [1]. Breast cancer represents the most prevalent malignancy among women and the leading cause of death from cancer in females worldwide [2,3]. There were 2,261,419 new cases of breast cancer worldwide in 2020 (11.7% of all cancers) according to recent data from the Global Cancer Observatory of WHO. Breast cancer caused 6.9% of cancer deaths worldwide in 2020. The incidence and mortality of breast cancer are the highest among other cancers and have been on the rise in recent years, with a higher prevalence in older women. In Qatar, according to recent data from the Global Cancer Observatory in 2020, breast cancer was by far the most common cancer among both sexes, with 14.7% of new cancer cases. In women, breast cancer is also the most common, representing 37.7% of all cancers. It is ranked as the third cause of cancer mortality in Qatar, with 57 (8.1%) deaths after lung (10.9%) and leukemia (8.5%) when both sexes are considered.
Among breast cancer subtypes, triple-negative breast cancer (TNBC) is one of the most aggressive. It accounts for 10–20% of all breast cancer cases [4,5]. TNBCs are defined as breast cancers that do not express human epidermal growth factor receptor 2 (HER2), estrogen receptor (ER), or progesterone receptor (PR) [6]. These cancers are associated with advanced and higher tumor grades, distinct metastasis, and frequent BRCA1 mutations [4]. The lack of hormone receptors and HER2 expression represents a clinical challenge, as this subtype of breast cancer does not respond to the currently available hormone- or HER2-targeted treatments. Therefore, TNBC patients have a poor prognosis, do not respond satisfactorily to current therapies, and have a markedly higher rate of relapse and disease recurrence compared with non-TNBC patients. Currently, there are no US Food and Drug Administration (FDA)-approved targeted medications to manage and treat patients diagnosed with TNBC. This situation has urged many researchers to address the heterogeneity of TNBC through the identification of novel molecular targets and potential therapeutic options.
CK2 is a serine/threonine-protein kinase that controls key intracellular signaling hubs and is involved in the pathogenesis of many diseases [7]. Emerging evidence suggests that CK2 may be a key molecular effector in breast cancer. CK2 is overexpressed in breast cancer cells, and its activity is associated with increased cell proliferation, migration, and invasion. In contrast, inhibition of CK2 activity was shown to reduce breast cancer cell survival. Hence, CK2 has received considerable interest as a therapeutic target in cancer drug design and discovery [7], and the design and synthesis of bioactive small molecules to inhibit CK2 has gained substantial interest [8]. CK2 has potential as a therapeutic target for cancer, especially TNBC, as it plays a crucial role in multiple cellular processes. Studies have shown that CK2 is often overexpressed and hyperactivated in TNBC, contributing to its aggressive behavior and resistance to standard treatments. Preclinical studies using CK2 inhibitors have shown promising results, including inhibition of tumor growth, induction of cell death, and sensitization of TNBC cells to chemotherapy and radiation therapy. Several CK2 inhibitors have reached preclinical and clinical development as potential therapeutic agents. Silmitasertib (CX-4945) is an orally administered CK2 inhibitor that has shown promising results in preclinical studies [9,10,11]. It has demonstrated anti-tumor activity in various cancer types, including breast, prostate, and lung cancers, and has undergone Phase I and Phase II clinical trials. In addition, 4,5,6,7-tetrabromobenzotriazole (TBB, NSC 231634) is another CK2 inhibitor that has been extensively studied in preclinical models and has exhibited potent anti-cancer effects by inhibiting CK2 activity [12].
Quinalizarin and ellagic acid (a coumarin derivative) are other CK2 inhibitors that have shown inhibitory effects on CK2 and demonstrated potential anti-cancer properties in preclinical studies [13,14]. More potent and innovative therapeutic molecules are required to effectively inhibit this protein in various cancers and overcome the challenge of drug resistance. Computational methods are highly effective in identifying promising hits for clinical applications across diverse diseases [15]. In this context, kinase-driven pathways such as CK2 represent attractive intervention points, and AI-guided drug discovery approaches offer a rational framework to identify potent and selective inhibitors. Leveraging an integrated suite of state-of-the-art computational methodologies, this study combines extra-precision chemical space exploration, long-timescale physics-based simulations, binding free energy calculations, quantum-mechanical analysis, and machine learning-based structure–activity modeling to uncover novel CK2 inhibitors. These approaches collectively enable the identification of potent, selective, and therapeutically safe candidates with strong inhibitory potential against CK2.

2. Results and Discussion

2.1. Structural Retrieval and Benchmarking

Casein Kinase 2 is a viable target for the discovery of drugs that could inhibit cancer cells in many types of cancers. The CK2 kinase comprises two major regions: residues 34–41, which are necessary for interaction with the beta subunit, and a protein kinase domain spanning residues 39–324. The structure of CK2, as given in Figure 1A,B, comprises several helices and beta-sheets that are connected by the surrounding loops. A co-crystal structure of CK2–CX-4945 has been resolved, in which the ligand is reported to interact through direct contact by establishing hydrogen bonds with Lys49, Tyr50, Ser51, Lys68, and Glu114 from the catalytic domain. Molecular docking of CX-4945 with CK2 gave a docking score of −9.57 kcal/mol, and the interaction pattern is given in Figure 1C. The 2D structure of CX-4945 is given in Figure 1D. Although several molecules are available to inhibit CK2, the direct and indirect roles of CK2 in drug resistance, through efflux or DNA repair mechanisms, have made it challenging to treat cancer successfully. Therefore, more efforts are needed to design more robustly binding small-molecule inhibitors that could overcome resistance based on drug efflux and DNA repair.
The discriminatory performance of the XP docking protocol was evaluated using retrospective enrichment analysis against a benchmark set of active compounds and decoys, and the results are summarized in Figure 2A–F. As shown in Figure 2A, the receiver operating characteristic curve yielded an ROC–AUC of 0.748, indicating that the docking protocol achieved clear separation between active and inactive compounds and performed substantially better than random ranking. Because ROC–AUC can overestimate performance in imbalanced datasets, we further assessed early recognition using the precision–recall framework. In Figure 2B, the precision–recall curve produced an average precision (AP) of 0.423, confirming meaningful recovery of active molecules despite the large excess of decoys. The practical value of the docking protocol for virtual screening is further demonstrated by the cumulative enrichment curve in Figure 2C, which rises steeply at the beginning of the ranked list, indicating preferential concentration of active compounds among the top-scoring molecules. Consistent with this trend, the early enrichment factor analysis in Figure 2D showed strong enrichment at low screening fractions, with EF@1% = 5.4, EF@2% = 4.9, EF@5% = 3.9, EF@10% = 3.1, and EF@20% = 2.5. These values indicate that the docking workflow enriched known actives several-fold above random expectation, particularly within the top-ranked subset that would be prioritized for follow-up analysis. Score distribution analysis further supported the discriminatory ability of the protocol. As shown in Figure 2E, active compounds displayed a clear shift toward more favorable docking scores compared with inactive compounds, although some overlap remained, which is expected for docking-based ranking methods. This result indicates that the scoring function captures relevant binding-related features, while also reflecting the inherent limitations of docking in completely separating actives from decoys. 
Finally, the PCA projection of ECFP4 fingerprints in Figure 2F showed that highly ranked compounds were distributed across multiple regions of chemical space rather than collapsing into a single dense cluster. This suggests that the docking protocol did not merely prioritize one redundant chemotype but instead retained chemical diversity among top-ranked hits, which is advantageous for downstream hit selection and lead optimization. Taken together, Figure 2 demonstrates that the XP docking workflow provides moderate-to-strong discrimination, meaningful early enrichment, and chemically diverse top-ranked candidates, thereby supporting its suitability as a structure-based prioritization tool in the present kinase inhibitor discovery pipeline.
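The enrichment metrics reported above (ROC–AUC, average precision, and EF@x%) can be illustrated with a minimal sketch. The scores and labels below are a hypothetical toy set, not the benchmark data of this study, and the function names are illustrative. The sketch assumes that a more negative docking score means a better rank and that labels are 1 for actives and 0 for decoys.

```python
from typing import List, Sequence


def ranked_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Order labels from the best (most negative) to the worst docking score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [labels[i] for i in order]


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random active outranks a random decoy (ties count 0.5)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values measured at the rank of each active."""
    hits, total = 0, 0.0
    for rank, y in enumerate(ranked_labels(scores, labels), start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float) -> float:
    """Hit rate in the top fraction divided by the hit rate of the whole set."""
    ranked = ranked_labels(scores, labels)
    n_top = max(1, round(fraction * len(ranked)))
    hit_rate_top = sum(ranked[:n_top]) / n_top
    hit_rate_all = sum(ranked) / len(ranked)
    return hit_rate_top / hit_rate_all


# Hypothetical toy benchmark: 3 actives and 7 decoys.
toy_scores = [-12.0, -11.0, -10.5, -9.0, -8.5, -8.0, -7.5, -7.0, -6.5, -6.0]
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

print("ROC-AUC:", roc_auc(toy_scores, toy_labels))
print("AP:", average_precision(toy_scores, toy_labels))
print("EF@20%:", enrichment_factor(toy_scores, toy_labels, 0.20))
```

Read this way, an EF@1% of 5.4 means that the top 1% of the ranked list contains actives at 5.4 times the rate expected from random selection.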

2.2. Extra Precision Screening of Small Molecule Libraries

We screened natural product libraries from the South African, North African, Northwest African, and COCONUT natural products databases to search for more potent compounds. Each library was prepared using the QikProp option in the Schrödinger Maestro virtual screening tool. Lipinski's rule of five (Ro5) was then applied to retain drug-like molecules. Of the ~480,000 compounds, 151,534 obeyed the Ro5 and were carried forward into the three-step docking workflow. In the high-throughput virtual screening (HTVS) step, 2663 compounds were predicted to bind CK2. These 2663 compounds were then subjected to standard-precision (SP) docking, where the top 10% (264 compounds) showed better binding potential than the others. In the final stage, extra-precision (XP) docking, 29 compounds were obtained as the best hits. After the removal of duplicates, seven compounds had a docking score better than that of the control (−9.57 kcal/mol). Among these top hits, Anastatin B showed the best docking score of −13.12 kcal/mol. Anastatin B has been described as a pharmacological agent and has been reported to inhibit liver cancer, cellular malignancies, and mushroom tyrosinase. Anastatin B formed seven hydrogen bonds, with Arg47, Lys68, Val116, and Asp175 as the key interacting residues. Gly46, Val53, Val66, Ile95, Met163, and Ile174 contribute other types of interactions that stabilize the binding. Although Anastatin B achieved a more favorable docking score than the control, its interactions are conserved with respect to the reported crystal structure, and it is therefore expected to produce similar pharmacological effects by blocking these essential residues.
On the other hand, 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one, a polyhydroxylated dibenzopyranone reported to possess hepatoprotective, nephroprotective, anti-cancer, and anti-inflammatory properties, showed a docking score of −12.36 kcal/mol and ranked as the second-best compound. This compound formed multiple hydrogen bonds with Lys68, Val116, and Asp175, while the other interactions, such as hydrophobic, π–π, π–alkyl, and π–cation interactions, involve Gly46, Val53, Val66, Ile95, Met163, and Ile174, matching the interacting residues of Anastatin B. These top hits also align closely with the CK2 interacting residues previously reported by Sun et al. and others, who showed that blocking these residues produces significant pharmacological results [16,17]. This suggests that these two compounds possess robust pharmacological potential and, therefore, could be considered as clinical candidates against CK2 in cancer chemotherapy. The 3D and 2D binding modes for Anastatin B and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one are given in Figure 3A,B.
Rhein, an anthraquinone that possesses hepatoprotective, nephroprotective, anti-cancer, and anti-inflammatory properties, showed a docking score of −12.36 kcal/mol. Its interactions involve the key residues Lys68, Glu114, Val116, and Asp175 and are therefore expected to block the activity of CK2. Next, we evaluated the binding mode of 6-methoxyquercetin, which showed a docking score of −11.60 kcal/mol and established several hydrogen bonds and other interactions. A total of seven hydrogen bonds were formed, involving Leu45, Lys68, Glu114, Val116, and Asp175. Val53, Val66, Lys68, Ile95, Met163, and Ile174 were involved in interactions other than hydrogen bonding. This binding mode shows that the key residues are blocked by 6-methoxyquercetin, which is thus expected to inhibit the activity of CK2 in various cancers. The 3D and 2D binding modes for 6-methoxyquercetin and Rhein are given in Figure 4A,B.
Parietinic acid demonstrated strong binding affinity toward CK2 with a docking score of −11.65 kcal/mol, forming key interactions within the ATP-binding pocket. Notably, it engaged the critical catalytic residues Lys68, Glu114, and Val116, which are known to play essential roles in kinase activity and ligand stabilization. In addition, multiple hydrophobic contacts involving Leu45, Val53, Val66, Ile95, Phe113, Met163, and Ile174 contributed to stabilizing the ligand within the binding cavity, suggesting favorable complementarity between the ligand scaffold and the hydrophobic core of the active site. Similarly, aloe emodin acetate exhibited a comparable binding profile with a docking score of −11.73 kcal/mol and formed six hydrogen bonds, primarily involving Glu114, Val116, Asn118, and Asp175. These residues are located within or adjacent to the hinge and catalytic regions of CK2, indicating that the ligand establishes strong polar interactions that are critical for binding specificity. In addition to hydrogen bonding, aloe emodin acetate also formed extensive hydrophobic interactions with residues such as Leu45, Val53, Val66, Lys68, Ile95, Phe113, Met163, and Ile174, reinforcing its stable accommodation within the binding pocket. The interaction patterns of both compounds are illustrated in Figure 5A,B, which depict their 3D binding conformations and 2D interaction maps. These figures show that both ligands occupy the canonical active site and engage key residues involved in ATP recognition and catalysis. Complementary structural and interaction details for all top-ranked compounds, along with their docking scores and interacting residues, are summarized in Table 1, providing a comparative overview of binding modes across the ligand set. Importantly, both compounds share structural features that enable consistent engagement with conserved active site residues, suggesting a common binding mechanism.
Their docking scores are notably more favorable than those of the reference control ligand, indicating stronger predicted binding affinity. While docking scores alone do not guarantee biological activity, the consistent interaction with key catalytic residues and the convergence of binding modes support their potential as CK2 inhibitors. These compounds were, therefore, selected for further downstream analyses, including molecular dynamics simulations and binding free energy evaluation. Furthermore, the observed interaction profiles are consistent with previously reported CK2 inhibitor binding patterns, where engagement with hinge-region residues such as Glu114 and Val116 is critical for activity. However, the present compounds exhibit enhanced binding scores and comparable or improved interaction networks, suggesting their potential as promising lead candidates for further optimization [18,19,20].
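The screening funnel described in this section (rule-of-five filter, rank-based selection, and comparison against the control score) can be sketched as follows. This is a minimal illustration and not the Schrödinger workflow used in the study: the compounds and descriptor values are hypothetical, the descriptors are assumed to be precomputed, and the strict "no violations" form of Lipinski's rule is an assumption. Only the control score of −9.57 kcal/mol is taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str
    mol_weight: float     # Da
    logp: float           # octanol-water partition coefficient
    h_donors: int         # hydrogen-bond donors
    h_acceptors: int      # hydrogen-bond acceptors
    docking_score: float  # kcal/mol; more negative is better


def passes_rule_of_five(c: Compound) -> bool:
    """Strict Lipinski filter: no violation allowed (an assumption)."""
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)


def keep_top_fraction(compounds: List[Compound], fraction: float) -> List[Compound]:
    """Rank by docking score and keep the best-scoring fraction."""
    ranked = sorted(compounds, key=lambda c: c.docking_score)
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[:n_keep]


def better_than_control(compounds: List[Compound],
                        control_score: float = -9.57) -> List[Compound]:
    """Keep compounds that score more favorably than the control."""
    return [c for c in compounds if c.docking_score < control_score]


# Hypothetical library; names and values are placeholders.
library = [
    Compound("cpd_A", 342.3, 2.1, 3, 6, -12.4),
    Compound("cpd_B", 612.7, 6.3, 7, 12, -13.0),  # fails the rule of five
    Compound("cpd_C", 270.2, 1.8, 2, 5, -8.9),    # weaker than the control
    Compound("cpd_D", 430.4, 3.5, 4, 8, -10.1),
]

druglike = [c for c in library if passes_rule_of_five(c)]
hits = better_than_control(keep_top_fraction(druglike, 1.0))
print([c.name for c in hits])
```

In a staged workflow, `keep_top_fraction` would be called with a smaller fraction at each docking stage, for example 0.10 for the top 10% retained after SP docking.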

2.3. Molecular Simulation-Based Stability Assessment

Dynamic stability is essential for a drug to exert its intended pharmacological effect. A dynamically stable complex indicates robust binding of a small molecule and informs the expected efficacy of the drug. The root mean square deviation (RMSD) has been widely used to quantify dynamic stability as a function of time from a simulation trajectory. We therefore calculated the RMSD from the simulation trajectory of each complex to assess its stability. The RMSD of the co-crystallized complex was calculated and used as a reference for the comparative analysis of the top hits. As given in Figure 6A, the control complex reached equilibrium at 18 ns and maintained an RMSD of 2.5 Å. Afterward, the production phase started, and the RMSD continued in a uniform pattern. At 60 ns, an abrupt increase was observed, with the RMSD reaching 3.10 Å. Later, the same pattern was observed with no significant structural perturbation, though the RMSD continued to increase gradually until the end of the simulation. An average RMSD of 2.45 Å was calculated for this complex. In contrast, the Anastatin B–CK2 complex showed lower RMSD values. The RMSD started from 0 and reached a maximum of 2.0 Å at 10 ns, and then a gradual decline was observed. At this point, the RMSD converged with that of the control complex, indicating that both complexes attained a similar atomic configuration. The RMSD continued to decrease gradually over the following 50 ns and up to 70 ns. After 70 ns, a small increase was observed, and the complex stabilized at 1.80 Å, maintaining the same level until the end of the simulation. The average RMSD for this complex was estimated to be 1.66 Å.
This shows that the newly identified molecule, Anastatin B, forms a more stable complex than the control, suggesting stronger binding, which is consistent with its having the best docking score in the initial round of screening. Moreover, the RMSD for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one followed a pattern similar to that of the control. The RMSD converged with that of the control complex, with stable dynamic behavior and no significant structural perturbations. Although the dynamics were stable overall, small deviations were observed between 100 and 140 ns. After 140 ns, the RMSD remained lower than that of the control. The average RMSD for the 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complex was calculated to be 1.95 Å. The RMSD graph for the 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one complex is given in Figure 6B. The 6-methoxyquercetin complex also demonstrated lower RMSD values than the control. With no significant structural perturbations, the complex equilibrated and stabilized at 1.6 Å at 8 ns. The structure continued to follow the same pattern until 50 ns, and then a small, abrupt increase in the RMSD was observed at 58 ns. For a short period, the RMSD remained higher; it then decreased and maintained the same level until the end of the simulation. An average RMSD for the 6-methoxyquercetin–CK2 complex was calculated to be 1.89 Å. The RMSD graph for the 6-methoxyquercetin–CK2 complex is given in Figure 6C. Similarly, aloe emodin acetate bound to CK2 presented comparable dynamic behavior. The RMSD was lower in the initial stage and increased gradually until 90 ns, after which it followed a uniform, flat path. No significant structural perturbation was observed, and the complex was the most stable, showing no notable deviations.
The RMSD graph for the aloe emodin acetate–CK2 complex is given in Figure 6D. On the other hand, the Parietinic acid–CK2 and Rhein–CK2 complexes, although demonstrating lower RMSDs, showed significant structural perturbations. In the Parietinic acid–CK2 complex, the RMSD continued to increase gradually until 120 ns, and then an abrupt decline with minor deviations at different time intervals was observed. During the last 10 ns, the RMSD of the complex alternately increased and decreased. This suggests that the ligand may be released or that internal loops move substantially, causing significant perturbations upon the binding of Parietinic acid. The RMSD graph for the Parietinic acid–CK2 complex is given in Figure 6E. The Rhein–CK2 complex also kept a lower RMSD than the control, but significant deviations were observed after 140 ns. The RMSD was lower in the first 140 ns, with a minor deviation at 75 ns, and reached a maximum at 138 ns. The RMSD then decreased again at 170 ns, after which an abrupt increase occurred. This indicates a similar destabilization effect produced by the binding of Rhein to CK2. The RMSD graph for the Rhein–CK2 complex is given in Figure 6F. Overall, these findings suggest that the identified hits bind CK2 more stably than the control and may produce effective pharmacological effects by blocking key residues under dynamic conditions. Our results further indicate that Anastatin B and aloe emodin acetate are more stable than the other hits and thus should be prioritized in experimental testing for clinical applications. Small molecules that bind to CK2 and result in stable dynamic behavior have been reported to be more pharmacologically active than those that are unstable [21,22].
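The RMSD values discussed in this section follow the standard definition, RMSD(t) = sqrt((1/N) Σ |r_i(t) − r_i,ref|²), taken over N atoms. The sketch below shows the calculation on hypothetical coordinates. It assumes that every frame has already been superimposed on the reference structure, so no alignment step is included, and it does not reproduce the trajectory tooling used in the study.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD in the input units between one frame and the reference structure."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(frame, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(frame))


def rmsd_series(trajectory: Sequence[Sequence[Coord]],
                reference: Sequence[Coord]) -> List[float]:
    """RMSD of every frame, in trajectory order."""
    return [rmsd(frame, reference) for frame in trajectory]


# Hypothetical two-atom system with three pre-aligned frames (units: angstrom).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the reference
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # both atoms shifted by 1
    [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)],  # one atom shifted by 2
]

series = rmsd_series(trajectory, reference)
average = sum(series) / len(series)
print(series, average)
```

The per-complex averages quoted above (for example 2.45 Å for the control and 1.66 Å for Anastatin B) correspond to the mean of such a series over the whole trajectory.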

2.4. Structural Compactness Analysis

Estimating the size and compactness of a protein reveals significant information regarding the binding and unbinding of small molecules in a cavity. It is an essential approach for determining the behavior of the protein in the apo and holo states, and it has been widely used to extract features that are necessary for the pharmacological inhibition of a particular target, specifically binding and unbinding events. Considering the essential role of the radius of gyration (Rg) in reflecting the potential of small molecules, we calculated Rg as a function of time to understand the protein's compactness. As given in Figure 7A, the control complex, i.e., CK2–ellagic acid, showed an average Rg of 20.60 Å with no significant deviation throughout the simulation. This shows that the complex remains in a stable state with minimal unbinding events, indicating robust binding of this compound. Anastatin B in complex with CK2 showed similar behavior in terms of structural compactness. No significant deviation was observed, and an average Rg of 20.60 Å was also calculated. The Rg profiles for the control and Anastatin B corroborate the RMSD results and thus show uniform dynamic behavior. The Rg results for the control and Anastatin B bound to CK2 are given in Figure 7A. On the other hand, the 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complex initially showed a higher Rg value, from 10 to 110 ns, after which the Rg values decreased and converged with those of the control complex. This loss of compactness is due to the N-terminus and C-terminus, corresponding to residues 37–96 and 321–327, respectively. The N-terminus is particularly important because it is where the binding of a small molecule takes place. When the N-terminus moves outward, the ligand, which binds too weakly to hold the N-terminus in place, is released from the cavity.
An average Rg for the 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complex was calculated to be 20.75 Å, which is higher than that of the control and Anastatin B complexes. The Rg results for the 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complex are given in Figure 7B. The 6-methoxyquercetin complex showed behavior consistent with its RMSD and maintained tighter packing of the protein due to the robust binding of 6-methoxyquercetin. After 120 ns, the Rg increased slightly, but this is due to C-terminus movement and hence indicates minimal unbinding events for this complex. An average Rg for the 6-methoxyquercetin–CK2 complex was calculated to be 20.65 Å. The Rg results for the 6-methoxyquercetin–CK2 complex are given in Figure 7C. The Rg results for the aloe emodin acetate–CK2 complex were also consistent with its RMSD. After an initial increase at the start of the trajectory, the Rg pattern closely aligned with that of the control, with no significant increase or decrease in Rg values. This indicates stable binding of aloe emodin acetate to CK2, with an average Rg of 20.57 Å. The Rg graph for the aloe emodin acetate–CK2 complex is given in Figure 7D. Unlike aloe emodin acetate, Parietinic acid and Rhein showed fluctuations in the Rg pattern. The Parietinic acid–CK2 complex showed significant structural unwinding, which increased the protein size. It was also observed that the N-terminus opens up and releases the ligand, which consequently lowers the binding affinity for Parietinic acid. The Rg graph for the Parietinic acid–CK2 complex is given in Figure 7E. Although the Rg for the Rhein–CK2 complex remained lower during the first 1–120 ns, an alternating decrease/increase pattern was observed. An average Rg for the Rhein–CK2 complex was calculated to be 20.68 Å. The Rg graph for the Rhein–CK2 complex is given in Figure 7F.
Overall, these findings show that some of these compounds display minimal unbinding events, while others are released from the cavity due to N-terminus movement. Anastatin B, 6-methoxyquercetin, and aloe emodin acetate are therefore the best hits and should be prioritized in clinical testing against CK2 in breast cancer. Compared with previous results, where frequent unbinding events were reported, these compounds in complex with CK2 maintain stable, compact packing of the protein and may, therefore, act as stronger pharmacological candidates than those previously reported [18].
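For reference, the radius of gyration used in this section is the mass-weighted root mean square distance of the atoms from their center of mass, Rg = sqrt(Σ m_i |r_i − r_com|² / Σ m_i). A minimal sketch on hypothetical coordinates and masses is shown below; the function name and toy values are illustrative only.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Coord], masses: Sequence[float]) -> float:
    """Mass-weighted radius of gyration of one frame, in the input length units."""
    total_mass = sum(masses)
    # Center of mass, computed per Cartesian component.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
           for k in range(3)]
    weighted_sq = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
                      for m, c in zip(masses, coords))
    return math.sqrt(weighted_sq / total_mass)


# Hypothetical two-atom examples (units: angstrom and atomic mass units).
rg_equal = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0])
rg_unequal = radius_of_gyration([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [1.0, 3.0])
print(rg_equal, rg_unequal)
```

Applying this function to every frame of a trajectory gives the Rg time series; a rise in that series indicates that the structure is becoming less compact.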

2.5. Root Mean Square Fluctuation Analysis (RMSF)

Residue fluctuation indexing is helpful in understanding molecular recognition, drug discovery, catalysis, and other biological functions. Understanding the flexibility of each residue is essential in the process of drug discovery. Therefore, in the current study, we used the simulation trajectories to calculate the flexibility of each residue. As shown in Figure 8A, region 1–75, which corresponds to residues 37–111, demonstrated higher fluctuations. This domain is connected to the rest of the protein through a loop, and the flexibility of this loop increases its movement and causes higher fluctuation. Other regions, such as 76–225, behaved alike across the complexes, with minimal fluctuations. The region 226–260 corresponds to a loop and a helix that cover the active site residues of the catalytic domain; its opening and closing helps move the ligand in and out during simulation. The regions that fluctuate more frequently are highlighted in Figure 8B. In sum, the binding of different ligands, such as the control, Anastatin B, 6-methoxyquercetin, and aloe emodin acetate, reduces the internal fluctuation of region 37–161 and stabilizes the residues, consistent with their pharmacological effects.
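The per-residue fluctuation described above is the RMSF, RMSF_i = sqrt(⟨|r_i(t) − ⟨r_i⟩|²⟩), where the average runs over trajectory frames. The sketch below computes it for hypothetical coordinates, one representative atom per residue, and assumes pre-aligned frames. The offset of 36 used to map plot indices to residue numbers is inferred from the statement that region 1–75 corresponds to residues 37–111.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsf(trajectory: Sequence[Sequence[Coord]]) -> List[float]:
    """RMSF of each atom around its time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean_pos = [sum(frame[i][k] for frame in trajectory) / n_frames
                    for k in range(3)]
        mean_sq = sum(sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
                      for frame in trajectory) / n_frames
        values.append(math.sqrt(mean_sq))
    return values


def to_residue_number(plot_index: int, offset: int = 36) -> int:
    """Map a 1-based plot index to the residue number (offset inferred from the text)."""
    return plot_index + offset


# Hypothetical trajectory: atom 0 moves between two positions, atom 1 is fixed.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]

fluctuations = rmsf(toy_trajectory)
print(fluctuations, to_residue_number(1), to_residue_number(75))
```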

2.6. Hydrogen Bonding Analysis

Macromolecular complexes, particularly proteins coupled with a small molecule or another protein partner, are primarily driven by hydrogen bonding and hydrophobic contacts. The environment of protein interfaces is enriched with water molecules that work with the residues to form hydrogen bonds. Thus, it is important to understand the hydrogen bonding landscape in a molecular association. Hydrogen bonding has previously been used to estimate the strength of the association between two molecules, which shed light on different mechanisms. Here, we have employed a similar approach to understand the differences in hydrogen bonding between the control and the lead molecule complexes. The average number of hydrogen bonds was higher for Anastatin B, aloe emodin acetate, and Rhein than for the control. For the other compounds, the number of hydrogen bonds during the simulation was lower than that of the control. Hence, this further supports the robust pharmacological potential of Anastatin B, aloe emodin acetate, and Rhein in contrast to the control drug. In conclusion, our results identify Anastatin B, aloe emodin acetate, and Rhein as promising inhibitors of CK2 activity in breast cancer. The hydrogen bonding results are given in Figure 9A–F.
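Hydrogen bonds in a trajectory are usually counted frame by frame with a geometric criterion. The sketch below uses a donor–acceptor distance cutoff of 3.5 Å and a donor–hydrogen–acceptor angle cutoff of 135°. Both cutoffs are assumptions chosen for illustration, since the criteria used in this study are not stated here, and the coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Triplet = Tuple[Coord, Coord, Coord]  # (donor, hydrogen, acceptor)


def angle_deg(a: Coord, b: Coord, c: Coord) -> float:
    """Angle a-b-c in degrees, measured at vertex b."""
    u = [a[k] - b[k] for k in range(3)]
    v = [c[k] - b[k] for k in range(3)]
    cos_theta = sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def is_hbond(donor: Coord, hydrogen: Coord, acceptor: Coord,
             max_dist: float = 3.5, min_angle: float = 135.0) -> bool:
    """Geometric test; both cutoffs are illustrative assumptions."""
    return (math.dist(donor, acceptor) <= max_dist
            and angle_deg(donor, hydrogen, acceptor) >= min_angle)


def count_hbonds(frame: Sequence[Triplet]) -> int:
    """Number of candidate triplets in one frame that satisfy the criterion."""
    return sum(is_hbond(d, h, a) for d, h, a in frame)


def average_hbonds(trajectory: Sequence[Sequence[Triplet]]) -> float:
    """Mean hydrogen-bond count over all frames."""
    counts: List[int] = [count_hbonds(frame) for frame in trajectory]
    return sum(counts) / len(counts)


donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
linear = (donor, hydrogen, (2.9, 0.0, 0.0))  # 2.9 A, 180 degrees: accepted
bent = (donor, hydrogen, (1.0, 2.5, 0.0))    # close, but 90 degrees: rejected
far = (donor, hydrogen, (5.0, 0.0, 0.0))     # 5.0 A: rejected

toy_frames = [[linear, bent, far], [linear, linear, far]]
print(average_hbonds(toy_frames))
```

Comparing the average count for each ligand complex against that of the control gives the kind of ranking reported in this section.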

2.7. Binding Free Energy Calculation

To re-evaluate the docked binding conformations, we calculated binding free energies using both the MM-PBSA and MM-GBSA methods. These methods are widely applicable, fast, and generally reliable for evaluating the activity of small molecules predicted by docking. They have been used for years to design novel inhibitors for various targets in diseases such as cancer, diabetes, COVID-19, and monkeypox. We therefore used them to re-evaluate the binding potential of our top docking hits. The binding free energy was calculated for three time intervals: 1–10 ns (equilibrium state), 11–30 ns (region of observed instability), and 185–200 ns, which represented the most stable phase of all complexes. The MM-PBSA results for the control drug during 1–10 ns revealed a van der Waals (vdW) contribution of −33.58 ± 0.15 kcal/mol, while Anastatin B exhibited a stronger vdW energy of −42.78 ± 0.32 kcal/mol. Aloe emodin acetate, 6-methoxyquercetin, and the newly engineered scaffold showed vdW values of −39.02 ± 0.13 kcal/mol, −38.41 ± 0.15 kcal/mol, and −38.39 ± 0.31 kcal/mol, respectively. The compounds 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, Parietinic acid, and Rhein demonstrated comparatively lower vdW values of −24.99 ± 0.19 kcal/mol, −29.14 ± 0.11 kcal/mol, and −27.27 ± 0.29 kcal/mol, respectively. Similar vdW energies were also observed from the MM-GBSA method. For the electrostatic energy component (EEL) within 1–10 ns, the control reported −21.62 ± 0.31 kcal/mol, whereas the top hits displayed values of −9.12 ± 0.21 kcal/mol for Anastatin B, −5.97 ± 0.21 kcal/mol for aloe emodin acetate, −122.47 ± 0.53 kcal/mol for Parietinic acid, −9.12 ± 0.18 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −13.47 ± 0.20 kcal/mol for 6-methoxyquercetin, and −175.31 ± 1.46 kcal/mol for Rhein.
The total binding free energy (TBE) during 1–10 ns confirmed that the control drug had a TBE of −15.52 ± 0.16 kcal/mol, while Anastatin B recorded −26.85 ± 0.20 kcal/mol, aloe emodin acetate −17.28 ± 0.10 kcal/mol, Parietinic acid −16.37 ± 0.15 kcal/mol, 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one −16.37 ± 0.15 kcal/mol, 6-methoxyquercetin −23.95 ± 0.18 kcal/mol, and Rhein −14.82 ± 0.37 kcal/mol. The corresponding MM-GBSA TBEs were −16.29 ± 0.16 kcal/mol for the control, −29.23 ± 0.20 kcal/mol for Anastatin B, −33.36 ± 0.17 kcal/mol for aloe emodin acetate, −16.15 ± 0.89 kcal/mol for Parietinic acid, −12.91 ± 0.15 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −23.86 ± 0.17 kcal/mol for 6-methoxyquercetin, and −17.44 ± 0.66 kcal/mol for Rhein. During the 11–30 ns interval, the MM-PBSA results revealed vdW energies of −34.85 ± 0.13 kcal/mol for the control, −43.68 ± 0.14 kcal/mol for Anastatin B, −38.83 ± 0.08 kcal/mol for aloe emodin acetate, −30.65 ± 0.10 kcal/mol for Parietinic acid, −21.21 ± 0.18 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −37.02 ± 0.12 kcal/mol for 6-methoxyquercetin, and −24.19 ± 0.15 kcal/mol for Rhein. The corresponding EEL values were −24.13 ± 0.25 kcal/mol for the control, −8.22 ± 0.10 kcal/mol for Anastatin B, −5.97 ± 0.21 kcal/mol for aloe emodin acetate, −125.01 ± 0.61 kcal/mol for Parietinic acid, −7.09 ± 0.12 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −12.83 ± 0.11 kcal/mol for 6-methoxyquercetin, and −169.14 ± 1.13 kcal/mol for Rhein. The total binding free energy for 11–30 ns was −16.14 ± 0.12 kcal/mol for the control, −27.32 ± 0.10 kcal/mol for Anastatin B, −26.62 ± 0.13 kcal/mol for aloe emodin acetate, −17.58 ± 0.13 kcal/mol for Parietinic acid, −14.17 ± 0.12 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −23.16 ± 0.10 kcal/mol for 6-methoxyquercetin, and −15.01 ± 0.16 kcal/mol for Rhein. 
Despite minor declines in stability for certain complexes, Anastatin B continued to display an increasingly favorable (more negative) TBE, highlighting its stronger pharmacological potential against CK2. In the final stabilized phase (185–200 ns), the control complex showed a TBE of −16.51 ± 0.08 kcal/mol, while Anastatin B maintained the highest stability with a TBE of −27.84 ± 0.08 kcal/mol. Aloe emodin acetate exhibited a TBE of −24.13 ± 0.08 kcal/mol, Parietinic acid −22.34 ± 0.11 kcal/mol, 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one −14.15 ± 0.07 kcal/mol, 6-methoxyquercetin −19.07 ± 0.07 kcal/mol, and Rhein −14.53 ± 0.04 kcal/mol. Using the MM-GBSA method, the corresponding TBEs were −17.25 ± 0.08 kcal/mol for the control, −27.97 ± 0.07 kcal/mol for Anastatin B, −27.47 ± 0.07 kcal/mol for aloe emodin acetate, −22.78 ± 0.91 kcal/mol for Parietinic acid, −11.68 ± 0.63 kcal/mol for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]-pyran-6-one, −18.42 ± 0.05 kcal/mol for 6-methoxyquercetin, and −13.15 ± 0.15 kcal/mol for Rhein. Overall, the progressive strengthening of Anastatin B’s binding energy across time intervals demonstrates its sustained interaction and enhanced stability within the CK2 binding pocket. Aloe emodin acetate and 6-methoxyquercetin also exhibited strong binding comparable to the control, indicating that these molecules could serve as promising CK2 inhibitors with therapeutic potential, pending further pharmacological validation. Our compounds show more favorable binding free energies than previously reported compounds [21,23,24,25,26], further supporting their robust binding. All the results, including the vdW, EEL, PB, GB, Delta G gas, Delta G solvated, and total binding free energies, are given in Table 2 and Table 3. All energies are given in kcal/mol.

2.8. Frontier Molecular Orbital (FMO) and Electronic Structure Analysis

The analyzed compounds exhibited distinct HOMO–LUMO profiles that reflect their stability and electronic transitions. Parietinic acid displayed a HOMO at −6.85 eV, a LUMO at −3.05 eV, and a gap of 3.80 eV, suggesting moderate stability and potential reactivity through π–π interactions. Its relatively strong dipole moment (2.36 D total) indicates favorable polarity for biomolecular interactions. Rhein exhibited a slightly wider gap (3.86 eV) with a more negative HOMO (−7.02 eV), suggesting higher oxidative stability but reduced electron-donating ability compared to Parietinic acid. 6-Methoxyquercetin and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one both showed wider gaps (4.21 and 4.37 eV, respectively) than Parietinic acid and Rhein, which is conventionally associated with greater electronic stability and lower polarizability. Their higher LUMO levels (−1.36 and −1.28 eV) indicate weaker electron-accepting potential, so any radical scavenging and antioxidant activity would be expected to proceed through electron donation. The dipole moment of the pentahydroxy derivative (8.39 D) was particularly large, suggesting pronounced polarity and solubility, factors relevant for pharmacological interactions. Anastatin B exhibited a HOMO of −5.45 eV and a LUMO of −1.27 eV, yielding a gap of 4.18 eV. Its electronic properties, combined with a substantial dipole moment (3.62 D), suggest enhanced reactivity and intermolecular binding potential. In contrast, aloe emodin revealed a HOMO at −6.85 eV and a LUMO at −2.86 eV, with a gap of 3.99 eV. Its intermediate dipole moment (2.70 D) and electronic gap suggest a balanced profile between stability and reactivity, consistent with the reported pharmacological versatility of anthraquinones. Overall, the HOMO–LUMO analysis highlights subtle variations in electronic structure among these polyphenolic compounds. Narrower band gaps and higher dipole moments correspond to enhanced electron transfer and interaction potential, supporting their reported antioxidant and bioactive properties.
The combination of orbital mapping, energy gap analysis, and dipole evaluation provides valuable insight into the electronic factors underpinning their biological functions. The results are given in Supplementary Figure S1.
To complement the docking and molecular dynamics results, density functional theory (DFT) calculations were employed to examine the frontier molecular orbitals (FMOs) of the selected scaffolds. Both the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) were analyzed to gain insights into electron distribution, reactivity, and possible interaction sites with the Casein Kinase II active site (Supplementary Figure S2). For the representative scaffold, the HOMO energy was calculated at −7.02 eV, while the LUMO energy was found at −3.16 eV, giving a HOMO–LUMO gap (ΔE) of 3.86 eV. This moderate gap reflects a balance between chemical stability and reactivity, making the scaffold electronically suitable for bioactive interactions. Molecules with significantly smaller gaps often display excessive reactivity, whereas excessively large gaps correspond to chemically inert systems. The HOMO density plots were primarily localized on amide, carbonyl, and nitrogen-rich moieties, highlighting their ability to act as electron donors during binding, facilitating hydrogen bonding with polar residues in the ATP-binding cleft. In contrast, the LUMO densities were concentrated on the aromatic π-system, which suggests a strong propensity for π–π stacking interactions with hydrophobic and aromatic residues. In certain scaffolds, dual orbital delocalization was observed, enabling a combined donor–acceptor behavior that strengthens ligand–protein stabilization. The electronic features identified here complement the binding free energy decomposition from MM/GBSA and molecular simulations, where hydrogen bonding and π-stacking were consistently highlighted as key stabilizing interactions. Moreover, the HOMO–LUMO alignment confirms that polar regions drive specificity, while aromatic orbital delocalization drives affinity, offering a quantum-level validation of the observed QSAR and docking trends. 
Thus, the frontier orbital analysis not only provides quantum-mechanical evidence of electronic suitability but also benchmarks the scaffolds’ bioactivity potential against other computational approaches. This integrated interpretation reinforces their candidacy as potent CK2 inhibitors.

2.9. Machine Learning-Based Activity Prediction

Across all descriptor and fingerprint combinations, our AI-driven QSAR framework demonstrated consistent predictive capacity, with several models achieving performance levels indicative of reliable generalization. Cross-validation metrics (CV R2 ± SD) are reported for all models (Table 4), enabling a more rigorous assessment of model robustness. Importantly, the relationship between train R2, CV R2, and test R2 across models does not support the presence of pathological overfitting. For the best-performing feature spaces, particularly fingerprint-based (FP) representations, CV R2 values closely align with external test R2 values. For example, the stacking ensemble on FP features achieved test R2 = 0.690 and CV R2 = 0.693 ± 0.0036, while Random Forest and Gradient Boosting models yielded test R2 values of 0.668 and 0.664 with corresponding CV R2 values of 0.666 ± 0.041 and 0.664 ± 0.043, respectively. This strong agreement between CV and test performance indicates stable model generalization rather than overfitting to the training data. While train R2 values are consistently higher, as expected for ensemble learners, the observed train–test gaps (ΔR2 ~0.13–0.20 for most top-performing models) remain moderate and do not translate into inflated CV performance. Moreover, the relatively low standard deviations in CV R2 (typically ~0.03–0.06) demonstrate low variance across folds, further supporting model stability. Tree-based ensemble methods dominated performance across all feature sets. Random Forest, Gradient Boosting, and Histogram Gradient Boosting consistently achieved test R2 values in the range of 0.64–0.68, with closely matching CV R2 values, confirming their robustness.
Fingerprint-derived features (FP) emerged as the most predictive representation, while combined feature spaces (e.g., 2D + FP, 3D + FP, and full integration) further enhanced performance, highlighting the complementary nature of structural and physicochemical descriptors.
In contrast, 2D and 3D descriptors alone exhibited lower predictive power (test R2 of 0.45–0.65), reflecting their limited ability to capture structure–activity relationships fully. Stacking was implemented as a post hoc ensemble strategy to assess whether combining the strongest performing regressors within each feature space could improve predictive stability. The Graph Neural Network (GIN) baseline showed minimal predictive capability (test R2 = 0.020), likely due to data size limitations and lack of extensive hyperparameter optimization, further emphasizing the strength of classical ensemble approaches in moderate-sized datasets. Overall, the consistency between CV and external test performance, combined with controlled ΔR2 values and low CV variance, demonstrates that the developed QSAR models are robust, generalizable, and not overfitted, supporting their suitability for drug discovery applications. The optimal stacking model (2D + FP) demonstrated robust predictive performance, with a training RMSE of 0.486, a test RMSE of 0.642, and a cross-validation RMSE of 0.652 ± 0.052, consistent with the comprehensive benchmarking results presented in Supplementary Table S1. All the results are given in Table 4, while the machine learning model benchmarking for CK2 inhibitor prediction is shown in Figure 10.
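The metrics quoted in this section (R2, RMSE, MAE, and the train–test gap ΔR2) follow standard definitions. The sketch below shows how they can be computed from observed and predicted pIC50 values. It is an illustration with made-up numbers, not the benchmarking code of this study; the function names are hypothetical, and ΔR2 is taken as train R2 minus test R2.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def delta_r2(train_r2, test_r2):
    """Train-test gap used here as an overfitting indicator (assumed definition)."""
    return train_r2 - test_r2

# Toy pIC50 values, purely illustrative.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 7.5, 9.0]
print(r2_score(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

A small ΔR2 combined with a CV R2 close to the test R2 is the pattern the text interprets as stable generalization.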

2.10. Determining the pIC50 Values via Stacking Ensemble Model

For the CX series, the corresponding IC50 values were converted to pIC50 using the equation pIC50 = −log10(IC50), with IC50 expressed in mol/L. The reported pIC50 range (5.70–9.52) corresponds to experimentally observed IC50 values between 0.3 and 2000 nM, obtained from 23 assays. The newly identified hits exhibited comparable predicted values. Specifically, the Parietinic acid–CK2 complex showed a pIC50 of 7.263, the Rhein–CK2 complex 7.661, 6-methoxyquercetin 7.408, 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one 8.140, the Anastatin B–CK2 complex 9.376, and the aloe emodin–CK2 complex 7.113. The predicted pIC50 values for the reference compounds were ≈ 9.0 for CX-4945 and ≈ 8.38 for SGC-CK2-1. Although the CX series and our identified scaffolds belong to distinct chemical classes, the pIC50 values of our selected top hits are consistent with experimentally reported ranges, and the comparable pIC50 profiles indicate that the newly proposed hits are predicted to have similar potency toward CK2. Notably, Anastatin B exhibited a predicted pIC50 comparable to CX-4945 and SGC-CK2-1, suggesting near-clinical level potency, while compounds such as 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one and Rhein demonstrated activity profiles consistent with selective chemical probes like SGC-CK2-1. This suggests that these compounds possess strong inhibitory potential and can serve as promising starting points for structure-guided lead optimization to achieve enhanced activity and selectivity.
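The pIC50 conversion used above can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; it assumes IC50 values are supplied in nM and converted to mol/L before taking the logarithm, consistent with the definition pIC50 = −log10(IC50[M]) given in the Methods.

```python
import math

def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

def ic50_nm_from_pic50(pic50):
    """Inverse conversion: pIC50 back to IC50 in nM."""
    return 10.0 ** (-pic50) * 1e9

# The experimental bounds quoted in the text (0.3 and 2000 nM).
for ic50 in (0.3, 2000.0):
    print(ic50, round(pic50_from_nm(ic50), 2))
```

The two bounds map to pIC50 values of about 9.52 and 5.70, matching the range reported for the CX series.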

3. Materials and Methods

3.1. CK2 Structure Retrieval and Preparation

The crystallographic coordinates of Casein Kinase 2 (CK2) were retrieved from the RCSB Protein Data Bank using the accession number 3NGA [27]. The CK2 complex with CX-4945 was subjected to structure preparation and refinement using the Protein Preparation Wizard in Schrödinger Maestro 2023. Missing residues were modeled by filling in missing loops and side chains with the Prime tool [28]. The structure was preprocessed, and protonation states were assigned with PROPKA at pH 7.0. The complex was then minimized with the OPLS force field.

3.2. Benchmarking and Validation of the Molecular Screening Protocol

A benchmark library was constructed by retrieving 534 experimentally validated actives from ChEMBL (accessed on 25 January 2026) and related literature, which were standardized in RDKit (salt removal, tautomer/valence normalization, stereochemistry preservation) and converted to 3D with protonation at pH 7.0 and MMFF minimization [29,30]. To balance the dataset, 2922 property-matched but topologically distinct decoys were generated using a DUD-E workflow, ensuring similarity in MW, cLogP, HBD/HBA, rotatable bonds, and charge while avoiding scaffold overlap; these were likewise protonated, minimized, and deduplicated [31]. The receptor was prepared in Maestro with the Protein Preparation Wizard (bond orders, H-bond optimization, restrained OPLS minimization), and Glide Extra Precision (XP) docking was performed using grids centered on the crystallographic ligand, enabling conformational sampling and XP scoring [32]. Docking scores were oriented such that higher values consistently represented stronger predicted binding. Evaluation metrics included ROC–AUC, PR–AUC, enrichment factors (EF@1/2/5%), Top-N enrichment (50 and 384 ligands), and early recognition metrics (BEDROC, RIE at α = 20, 80.5, 160.9), with confidence intervals estimated from 500 bootstrap resamples [33]. Visualization consisted of shaded ROC and PR curves, cumulative enrichment with early percentile markers, gradient EF bar plots, violin + jitter distributions of docking scores, and a PCA projection of ECFP4 fingerprints to explore chemical space diversity [34].
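Two of the evaluation metrics listed above, ROC–AUC and the enrichment factor, can be illustrated compactly. The sketch below is not the authors' benchmarking script: the function names and toy scores are hypothetical, scores are assumed to be oriented so that higher means stronger predicted binding (as stated in the text), and the top fraction is truncated to a whole number of ligands.

```python
def roc_auc(scores, labels):
    """Probability that a random active outscores a random decoy (ties count 0.5)."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    decoys = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives for d in decoys
    )
    return wins / (len(actives) * len(decoys))

def enrichment_factor(scores, labels, fraction):
    """EF = (active rate in the top fraction) / (active rate in the whole library)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (sum(labels) / len(labels))

# Toy screen: 3 actives (label 1) among 10 ligands.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(roc_auc(scores, labels), enrichment_factor(scores, labels, 0.2))
```

An EF above 1 means actives are concentrated at the top of the ranked list relative to random selection.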

3.3. Exploring the Chemical Space for Novel Hits Identification

For CK2, we prepared the compound library using a database management tool in Schrödinger Maestro [35]. The library was assembled from natural product databases, namely the North African natural compounds database (NANDB), the East African natural compounds database (EANDB), and the COCONUT natural products database [36,37]. QikProp was run with a Lipinski prefilter to retain drug-like compounds that obey the rule of five (Ro5). Each ligand was prepared with LigPrep in Schrödinger Maestro, using Epik and removing high-energy ionization/tautomer states. Stereochemical information was taken from the 2D structures, and 4 stereoisomers were retained. A single low-energy conformation was generated for each ligand.
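The drug-likeness prefilter described above applies Lipinski's rule of five: molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. The sketch below shows the logic on precomputed properties. It is not the QikProp implementation; the dictionary keys, compound names, and property values are hypothetical, and allowing zero violations is an assumption (the classical rule tolerates one).

```python
def passes_rule_of_five(props, max_violations=0):
    """props: dict with precomputed 'mw', 'logp', 'hbd', 'hba' (hypothetical keys)."""
    violations = sum([
        props["mw"] > 500.0,   # molecular weight in Da
        props["logp"] > 5.0,   # octanol-water partition coefficient
        props["hbd"] > 5,      # hydrogen-bond donors
        props["hba"] > 10,     # hydrogen-bond acceptors
    ])
    return violations <= max_violations

# Illustrative entries, not taken from the screened databases.
library = {
    "cmpd_a": {"mw": 284.2, "logp": 2.1, "hbd": 3, "hba": 6},
    "cmpd_b": {"mw": 612.7, "logp": 5.8, "hbd": 7, "hba": 12},
}
drug_like = [name for name, props in library.items() if passes_rule_of_five(props)]
print(drug_like)
```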

3.4. Receptor Grid Generation and Molecular Screening

The prepared compound databases and the CK2 receptor were used as input to identify potential hits in the chemical space of natural products. The grid was centered on the active site at x = 1.505, y = −1.461, and z = −10.265. The inner box was defined as 10 × 10 × 10 Å, while the outer box was defined as 25.33 × 25.33 × 25.33 Å. A three-step docking approach (HTVS, SP, and XP) was applied for more accurate results. Flexible docking with a penalty for non-planar conformations was used, and post-docking minimization was performed, retaining 1 pose per compound and using the constraints for each complex. In the HTVS step, all states were retained, and the top 10% were then passed to the second step; in the SP step, only good-scoring states were selected for the next step. From XP docking, top hits were selected by using the best-scoring hits option. The interactions of the top hits were visualized in PyMOL and Discovery Studio Visualizer [38,39].
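The role of the two grid boxes can be illustrated with a simple geometric check. In the sketch below, a pose is accepted only if the ligand center lies inside the inner box and every atom lies inside the outer box. This is a simplified picture under several assumptions, not Glide's actual algorithm: box edges are taken to be in Å, the third center value is taken as the z coordinate, the ligand center is approximated by the atom centroid, and all names are hypothetical.

```python
# Grid parameters quoted in the text (third center value assumed to be z, units assumed A).
CENTER = (1.505, -1.461, -10.265)
INNER_EDGE = 10.0
OUTER_EDGE = 25.33

def inside_box(point, center, edge):
    """True if the point lies within an axis-aligned cube of the given edge length."""
    half = edge / 2.0
    return all(abs(p - c) <= half for p, c in zip(point, center))

def pose_fits_grid(atoms):
    """atoms: list of (x, y, z) ligand coordinates for one pose."""
    centroid = tuple(sum(a[k] for a in atoms) / len(atoms) for k in range(3))
    center_ok = inside_box(centroid, CENTER, INNER_EDGE)
    atoms_ok = all(inside_box(a, CENTER, OUTER_EDGE) for a in atoms)
    return center_ok and atoms_ok

# Toy two-atom poses: one near the grid center, one displaced by 10 A along x.
near = [(1.5, -1.5, -10.0), (3.0, -1.0, -11.0)]
far = [(x + 10.0, y, z) for x, y, z in near]
print(pose_fits_grid(near), pose_fits_grid(far))
```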

3.5. Molecular Dynamics Simulation

The pharmacological potential and inhibitory mechanism of the top hits were explored through atomistic simulation using the Amber 24 simulation software. In the first step, the ligand was parameterized, and the ff19SB force field was used for the protein [40]. Each complex was then solvated in a cubic box of TIP3P water with a cut-off distance of 12 Å. For neutralization, counter ions (Na⁺ and Cl⁻) were added to the system [41]. To relax the system, gentle minimization of 12,000 and 5000 steps was performed using the steepest descent and conjugate gradient algorithms, respectively. Each system was then gradually heated from 0 to 325 K [42], and bonds involving hydrogen atoms were constrained with the SHAKE algorithm. The Langevin thermostat was used for temperature control, while a barostat was used to maintain a pressure of 1 atm at a temperature of 310 K [43]. Under these pressure and temperature conditions, a 10-nanosecond (ns) equilibration was performed. Afterward, the MD simulation was conducted for a total of 200 ns at a constant temperature of 325 K [44]. We then examined the conformational changes that occurred during the simulations. To analyze these changes, we used the CPPTRAJ module within AMBER20 to calculate the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (RoG), and principal component analysis (PCA). The radius of gyration (RoG) is a valuable measure for analyzing protein size and compaction [45]. Finally, data visualization and analyses were performed with Origin Pro 2024b.
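The radius of gyration mentioned above is the mass-weighted root mean square distance of the atoms from their center of mass, so a smaller value indicates a more compact structure. The following is a minimal sketch of that definition for a single frame; it is not the CPPTRAJ routine used in this study, and the function name and toy coordinates are hypothetical.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z); masses: list of atomic masses in the same order.
    """
    total_mass = sum(masses)
    # Center of mass of the selection.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Toy system: two equal masses placed 2 A apart.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0]))
```

Computing this value for every frame gives the RoG trajectory that is inspected for compaction or unfolding.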

3.6. Binding Free Energy Calculation

Binding free energy calculation is among the most widely used approaches to re-evaluate the binding strength of a system [46]. It has been widely implemented because of its advantages over other methods, namely computational efficiency and lower time requirements. For each complex, the van der Waals (vdW), electrostatic, solvent-accessible surface area (SASA), and generalized Born (GB) components were determined based on the entire simulation trajectory. This method combines molecular mechanics energies, which describe the interactions between atoms, with implicit solvent models, which describe the interactions between the solute and solvent, to compute the binding strength [47,48,49]. Hence, we applied this approach here to compute the total binding free energy of the protein–ligand complexes. Mathematically, the binding free energy can be estimated as:
$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{complex}} - \left[\Delta G_{\mathrm{receptor}} + \Delta G_{\mathrm{ligand}}\right]$$
Different contributing components of total binding energy were calculated by the following equation:
$$G = G_{\mathrm{bond}} + G_{\mathrm{ele}} + G_{\mathrm{vdW}} + G_{\mathrm{pol}} + G_{\mathrm{npol}}$$
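The first relation above can be applied frame by frame, after which the per-frame values are averaged over the chosen trajectory window. The sketch below illustrates this with made-up energies. It is not the MM-PBSA/MM-GBSA workflow used in this study: the function name is hypothetical, and reporting the spread as the standard error of the mean is an assumption, since the text does not state what the ± values represent.

```python
import math

def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Per-frame energies in kcal/mol; returns (mean dG_bind, standard error)."""
    # dG_bind = G_complex - (G_receptor + G_ligand), evaluated for each frame.
    dg = [c - (r + l) for c, r, l in zip(g_complex, g_receptor, g_ligand)]
    n = len(dg)
    mean = sum(dg) / n
    variance = sum((x - mean) ** 2 for x in dg) / (n - 1)  # sample variance
    return mean, math.sqrt(variance / n)

# Three illustrative frames; the values are not taken from Table 2 or Table 3.
mean_dg, sem_dg = binding_free_energy(
    g_complex=[-110.0, -112.0, -114.0],
    g_receptor=[-60.0, -61.0, -62.0],
    g_ligand=[-30.0, -30.0, -30.0],
)
print(round(mean_dg, 2), round(sem_dg, 2))
```

More negative values indicate more favorable predicted binding.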

3.7. Frontier Molecular Orbital (FMO) and Electronic Structure Analysis

The electronic structure calculations were carried out using a custom HOMO–LUMO computational pipeline developed in Python, integrating RDKit for molecular preprocessing, PySCF for quantum chemical calculations, and PyMOL for orbital visualization [50,51]. Molecular structures were first generated from SMILES representations using RDKit. Hydrogen atoms were added explicitly, and the three-dimensional geometries were embedded using the ETKDGv3 algorithm followed by optimization with the Universal Force Field (UFF). These optimized geometries were converted into Cartesian coordinates (XYZ format) and served as input for quantum chemical calculations. All electronic structure computations were performed with PySCF (Python-based Simulations of Chemistry Framework). The Density Functional Theory (DFT) method was employed using the PBE0 functional with the 6-31G* basis set. For closed-shell molecules, restricted Hartree–Fock (RHF) or restricted Kohn–Sham (RKS) formalisms were applied, while restricted open-shell Hartree–Fock (ROHF) was used where required. Convergence criteria were set with a tolerance of 1 × 10⁻⁸ Hartree to ensure high accuracy. The frontier molecular orbitals (FMOs) were determined from the computed molecular orbital energy spectrum. The highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) indices were identified based on electron count, and their orbital energies were extracted. The energy gap (ΔE = E_LUMO − E_HOMO) was calculated in electronvolts (eV) using a Hartree-to-eV conversion factor of 27.2114. Additionally, the dipole moment vector and magnitude were calculated to provide insights into the polarity and electronic distribution of each molecule. HOMO and LUMO orbital density maps were exported as cube files, and PyMOL scripts were automatically generated for high-resolution orbital visualization.
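The step that identifies the frontier orbitals from the electron count and converts the gap to eV can be sketched independently of the quantum chemistry package. The example below assumes a closed-shell molecule in which each spatial orbital holds two electrons, so the HOMO is orbital number N/2 counted from the lowest energy. The function name and orbital energies are hypothetical; in the actual pipeline the energies would come from the converged DFT calculation.

```python
HARTREE_TO_EV = 27.2114  # conversion factor quoted in the text

def frontier_orbitals(mo_energies_hartree, n_electrons):
    """Return (HOMO, LUMO, gap) in eV for a closed-shell molecule."""
    if n_electrons % 2 != 0:
        raise ValueError("closed-shell sketch: electron count must be even")
    energies = sorted(mo_energies_hartree)
    homo_idx = n_electrons // 2 - 1  # zero-based index of the last doubly occupied orbital
    lumo_idx = homo_idx + 1
    if homo_idx < 0 or lumo_idx >= len(energies):
        raise ValueError("no occupied or no virtual orbital available")
    homo = energies[homo_idx] * HARTREE_TO_EV
    lumo = energies[lumo_idx] * HARTREE_TO_EV
    return homo, lumo, lumo - homo

# Illustrative orbital energies in Hartree for a 6-electron toy system.
homo, lumo, gap = frontier_orbitals([-0.50, -0.30, -0.25, -0.10, 0.05], n_electrons=6)
print(round(homo, 2), round(lumo, 2), round(gap, 2))
```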

3.8. Machine Learning-Driven QSAR Modeling

We developed a comprehensive, AI-driven QSAR modeling pipeline to systematically evaluate CK2 inhibition (CHEMBL3629) using multi-level molecular representations and state-of-the-art machine learning. Bioactivity data were retrieved from the ChEMBL database under stringent filtering criteria (IC50 in nM, equality relations only), standardized to canonical SMILES, and rigorously curated to remove invalid or extreme values prior to transformation into pIC50 (−log10(IC50[M])) within a defined dynamic range of 3–12. Structural information was enriched by computing an extensive suite of physicochemical and topological 2D descriptors (including LogP, TPSA, Estate VSA, SMR VSA, PEOE VSA, Kier–Hall indices, and Chi connectivity indices), alongside geometry-refined 3D shape descriptors (asphericity, eccentricity, PMI1–3, radius of gyration, and spherocity index) generated from MMFF-optimized ETKDG conformers with fixed random seeds to ensure reproducibility. Complementarily, six orthogonal fingerprinting schemes (ECFP4/6, FCFP4/6, MACCS, RDKit) were computed using MolFeat to capture circular, pharmacophoric, and substructure-level features, followed by low-variance filtering (threshold = 0.001) and median imputation of missing values. Feature representations were assembled into modular spaces (2D, 3D, FP, and combined), enabling systematic evaluation of descriptor contributions. To mitigate descriptor redundancy and high dimensionality, feature selection was performed using an embedded tree-based approach (ExtraTrees) within the modeling pipeline. A Bemis–Murcko scaffold-based splitting strategy was applied to construct an external test set, ensuring structural independence between training and test compounds and preventing chemical series leakage. Within the remaining data, stratified sampling based on pIC50 distribution (five bins) was used for training–validation partitioning. 
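The scaffold-based split described above keeps every compound that shares a Bemis–Murcko scaffold on the same side of the train/test boundary. The sketch below shows the grouping logic on precomputed scaffold strings, which in practice could be generated with RDKit's Murcko scaffold utilities. The function name, the 20% test fraction, and the rule of assigning the largest scaffold groups to training first are assumptions for illustration, not details reported in this study.

```python
from collections import defaultdict

def scaffold_split(scaffolds, test_fraction=0.2):
    """scaffolds: one scaffold string per compound; returns (train_idx, test_idx)."""
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)
    # Largest scaffold families first; ties broken by first occurrence for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    n_train_target = len(scaffolds) - int(round(test_fraction * len(scaffolds)))
    train, test = [], []
    for group in ordered:
        # A whole scaffold family goes to one side, never both.
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return sorted(train), sorted(test)

# Toy scaffold labels standing in for Murcko scaffold SMILES.
toy_scaffolds = ["A", "A", "A", "B", "B", "C", "D", "A", "B", "E"]
train_idx, test_idx = scaffold_split(toy_scaffolds)
print(train_idx, test_idx)
```

Because rare scaffolds end up in the test set, this kind of split tends to give a more conservative performance estimate than a random split.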
A diverse panel of machine learning algorithms was evaluated, including linear models (Ridge, ElasticNet, Partial Least Squares), kernel-based methods (SVR), ensemble learners (Random Forest, ExtraTrees, Gradient Boosting, HistGradientBoosting), neural networks (MLP with two hidden layers of 100 and 50 units), and GPU-accelerated boosting frameworks (XGBoost, LightGBM). Hyperparameters for key models were optimized using Optuna (30 Bayesian-guided trials with cross-validation), ensuring an appropriate bias–variance trade-off. All preprocessing steps, including imputation, scaling, and feature selection, were implemented within scikit-learn pipelines, ensuring transformations were confined to training folds and eliminating data leakage. To extend beyond descriptor-based representations, a Graph Isomorphism Network (GIN) was implemented using PyTorch Geometric with atom-level features, dual GINConv layers (hidden dimension = 64, ReLU activation), global add pooling, and early stopping (Adam optimizer, learning rate = 1 × 10⁻³, batch size = 64, patience = 8, maximum 60 epochs) to prevent overfitting. Ensemble learning was further explored through stacking, combining top-performing base learners with a Ridge regression meta-model to enhance predictive stability. Model performance was evaluated using R2, RMSE, and MAE across both cross-validation and an external scaffold-based test set, with overfitting monitored via ΔR2. Model robustness was further validated using Y-scrambling, which yielded near-zero predictive performance for randomized targets, confirming the absence of chance correlations. Applicability domain (AD) analysis was performed using k-nearest neighbor distance metrics to ensure predictions were confined to reliable chemical space. Prediction uncertainty was additionally quantified using conformal prediction (MAPIE), providing 90% prediction intervals and coverage estimates.
Reproducibility was ensured by fixing random seeds (SEED = 42), logging software environments, enabling GPU acceleration where available, and archiving descriptors, feature matrices, and model outputs in machine-readable formats.
Stacking ensembles were constructed separately for each feature space using the StackingRegressor framework implemented in scikit-learn. After training the individual base regressors on the preprocessed training data, models with positive external test set performance were retained, and the top three base learners within each feature set were selected according to their test R2 values. These selected regressors were then combined into a two-level ensemble, where their predictions served as inputs to a Ridge regression meta-learner (alpha = 1.0). Before stacking, all input features were standardized using StandardScaler, and for high-dimensional feature spaces (>100 variables), embedded feature selection was performed using SelectFromModel with an ExtraTreesRegressor and a median importance threshold. The stacking model was trained on the scaled/selected training matrix and evaluated on the independent test set using R2, RMSE, and MAE. Cross-validation performance for stacking ensembles was additionally computed using 5-fold KFold cross-validation on the same transformed training matrix, and the corresponding CV R2 means and standard deviations were calculated. Collectively, this pipeline integrates classical QSAR rigor with modern machine learning and graph-based approaches, establishing a robust and reproducible framework for kinase inhibitor prediction under realistic chemical space constraints [52,53,54].
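The 90% prediction intervals mentioned in this section rest on the conformal prediction idea of calibrating interval width on held-out residuals. The sketch below shows the simplest variant, split conformal prediction with absolute residuals. It is a conceptual illustration only and does not reproduce the MAPIE configuration used in this study; the function name and calibration values are hypothetical.

```python
import math

def split_conformal_halfwidth(calib_true, calib_pred, alpha=0.10):
    """Half-width q such that [pred - q, pred + q] targets 1 - alpha coverage."""
    residuals = sorted(abs(t - p) for t, p in zip(calib_true, calib_pred))
    n = len(residuals)
    # Finite-sample corrected rank of the conformal quantile.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few calibration points for the requested coverage
    return residuals[k - 1]

# Ten illustrative calibration compounds with absolute errors 0.1, 0.2, ..., 1.0.
calib_true = [7.0] * 10
calib_pred = [7.0 + 0.1 * i for i in range(1, 11)]
q = split_conformal_halfwidth(calib_true, calib_pred, alpha=0.10)

new_prediction = 8.14  # a predicted pIC50 for a new compound
print(new_prediction - q, new_prediction + q)
```

With only ten calibration points the 90% interval is set by the largest residual, which shows why a reasonably large calibration set is needed for informative intervals.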

3.9. Hardware and Reproducibility

All computations were performed on Linux with Python 3.10. Reproducibility was ensured by fixing random seeds (42) and recording all software versions. Where available, GPU acceleration (NVIDIA CUDA 12.4) was employed on a workstation with an RTX 4090 GPU and an i9-13900K CPU. Outputs, including feature matrices, trained models, and performance metrics, were saved in JSON and CSV formats for transparency.

4. Conclusions

This study delivers a comprehensive AI-guided discovery platform uniting data-driven learning with atomistic simulations to advance CK2-targeted therapy in TNBC. Through systematic benchmarking, several natural-product-derived scaffolds, particularly Anastatin B and aloe emodin acetate, demonstrated superior binding stability, dynamic compactness, and favorable quantum descriptors compared with the clinical reference CX-4945. The convergence of machine learning predictions, molecular dynamics validation, and quantum-mechanical energy profiling underscores the translational value of integrative computational pipelines in anti-cancer drug discovery. Despite the encouraging results, several limitations should be acknowledged. Model performance depends on the quality and diversity of experimental data, which may limit reliability for novel chemical space. Variability in bioactivity data can introduce uncertainty, and the use of multiple descriptors may still lead to redundancy and overfitting. Additionally, docking and MM-GBSA provide approximate binding estimates and do not fully capture protein flexibility or biological complexity. Future work should focus on larger high-quality datasets and experimental validation. Future optimization of these leads through fragment-growth and free energy perturbation approaches could yield next-generation CK2 inhibitors with enhanced potency and selectivity.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ph19050694/s1, Figure S1: HOMO–LUMO energy profile and frontier molecular orbital (FMO) visualization of top CK2 inhibitors; Figure S2: Frontier molecular orbitals of selected CK2 inhibitors showing electron density distribution; Table S1: Performance of machine learning models across descriptor and fingerprint feature sets for CK2 inhibitor prediction.

Author Contributions

Conceptualization, A.K., F.M.A., A.M., M.S., L.C.M. and A.A.; Methodology, A.K., F.M.A., A.M. and M.S.; Software, A.K. and M.S.; Validation, A.K., F.M.A., A.M., M.S. and L.C.M.; Formal analysis, A.K.; Investigation, A.K., A.M., M.S. and A.A.; Resources, A.K., M.S., R.M.A.-Z. and A.A.; Data curation, A.K., F.M.A., A.M., M.S. and R.M.A.-Z.; Writing—original draft, A.K., R.M.A.-Z., L.C.M. and A.A.; Writing—review & editing, A.K. and A.A.; Visualization, A.K., A.M., R.M.A.-Z., L.C.M. and A.A.; Supervision, A.K., R.M.A.-Z. and A.A.; Project administration, A.K., R.M.A.-Z., L.C.M. and A.A.; Funding acquisition, A.K., R.M.A.-Z. and A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Qatar Research Development and Innovation Council (Qatar National Research Fund) [grants No. ARG02-0421-240249 and ARG01-0601-230451] and Qatar University [grant No. QUPD-CPH-23/24-592]. The statements made herein are solely the responsibility of the authors.

Institutional Review Board Statement

No ethical approval was required for this computational study as it did not involve human participants, animals, or clinical data.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Acknowledgments

All scripts and analysis pipelines used in this study (molecular dynamics post-processing, MM/PBSA automation, and ML benchmarking) are available on GitHub at https://github.com/Abbas24-AI/ML-QSAR-CK2.git, accessed on 16 April 2026, under the MIT License.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

WHO: World Health Organization
TNBC: triple-negative breast cancer
HER2: human epidermal growth factor receptor 2
ER: estrogen receptor
PR: progesterone receptor
ROC–AUC: Receiver Operating Characteristic–Area Under the Curve
DFT: Density Functional Theory
RMSD: Root Mean Square Deviation
RMSF: Root Mean Square Fluctuation
PCA: Principal Component Analysis
Rg: Radius of Gyration
AMBER: Assisted Model Building with Energy Refinement
MM-GBSA: Molecular Mechanics Generalized Born Surface Area
MM-PBSA: Molecular Mechanics Poisson–Boltzmann Surface Area
HOMO–LUMO: Highest Occupied Molecular Orbital–Lowest Unoccupied Molecular Orbital

References

  1. Almansour, N.M. Triple-Negative Breast Cancer: A Brief Review About Epidemiology, Risk Factors, Signaling Pathways, Treatment and Role of Artificial Intelligence. Front. Mol. Biosci. 2022, 9, 836417. [Google Scholar] [CrossRef]
  2. Siegel, R.L.; Miller, K.D.; Jemal, A. Cancer statistics, 2018. CA Cancer J. Clin. 2018, 68, 7–30. [Google Scholar] [CrossRef]
  3. Bray, F.; Ferlay, J.; Soerjomataram, I.; Siegel, R.L.; Torre, L.A.; Jemal, A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2018, 68, 394–424. [Google Scholar] [CrossRef]
  4. Perou, C.M.; Sorlie, T.; Eisen, M.B.; van de Rijn, M.; Jeffrey, S.S.; Rees, C.A.; Pollack, J.R.; Ross, D.T.; Johnsen, H.; Akslen, L.A.; et al. Molecular portraits of human breast tumours. Nature 2000, 406, 747–752. [Google Scholar] [CrossRef]
  5. Gnant, M.; Harbeck, N.; Thomssen, C. St. Gallen 2011: Summary of the Consensus Discussion. Breast Care 2011, 6, 136–141. [Google Scholar] [CrossRef] [PubMed]
  6. Lehmann, B.D.; Bauer, J.A.; Chen, X.; Sanders, M.E.; Chakravarthy, A.B.; Shyr, Y.; Pietenpol, J.A. Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies. J. Clin. Investig. 2011, 121, 2750–2767. [Google Scholar] [CrossRef] [PubMed]
  7. Borgo, C.; D’Amore, C.; Sarno, S.; Salvi, M.; Ruzzene, M. Protein kinase CK2: A potential therapeutic target for diverse human diseases. Signal Transduct. Target. Ther. 2021, 6, 183. [Google Scholar] [CrossRef]
  8. Haidar, S.; Marminon, C.; Aichele, D.; Nacereddine, A.; Zeinyeh, W.; Bouzina, A.; Berredjem, M.; Ettouati, L.; Bouaziz, Z.; Le Borgne, M. QSAR model of indeno[1,2-b]indole derivatives and identification of N-isopentyl-2-methyl-4,9-dioxo-4,9-Dihydronaphtho[2,3-b]furan-3-carboxamide as a potent CK2 inhibitor. Molecules 2019, 25, 97. [Google Scholar] [CrossRef] [PubMed]
  9. Pierre, F.; Chua, P.C.; O’Brien, S.E.; Siddiqui-Jain, A.; Bourbon, P.; Haddach, M.; Michaux, J.; Nagasawa, J.; Schwaebe, M.K.; Stefan, E.; et al. Pre-clinical characterization of CX-4945, a potent and selective small molecule inhibitor of CK2 for the treatment of cancer. Mol. Cell. Biochem. 2011, 356, 37–43. [Google Scholar] [CrossRef]
  10. Su, Y.W.; Huang, W.Y.; Lin, H.C.; Liao, P.N.; Lin, C.Y.; Lin, X.Y.; Huang, S.H.; Chen, Y.T.; Wu, P.S. Silmitasertib, a casein kinase 2 inhibitor, induces massive lipid droplet accumulation and nonapoptotic cell death in head and neck cancer cells. J. Oral. Pathol. Med. 2023, 52, 245–254. [Google Scholar] [CrossRef]
  11. Zanin, S.; Borgo, C.; Girardi, C.; O’Brien, S.E.; Miyata, Y.; Pinna, L.A.; Donella-Deana, A.; Ruzzene, M. Effects of the CK2 Inhibitors CX-4945 and CX-5011 on Drug-Resistant Cells. PLoS ONE 2012, 7, e49193. [Google Scholar] [CrossRef]
  12. Sarno, S.; Reddy, H.; Meggio, F.; Ruzzene, M.; Davies, S.P.; Donella-Deana, A.; Shugar, D.; Pinna, L.A. Selectivity of 4,5,6,7-tetrabromobenzotriazole, an ATP site-directed inhibitor of protein kinase CK2 (‘casein kinase-2’). FEBS Lett. 2001, 496, 44–48. [Google Scholar] [CrossRef]
  13. Cozza, G.; Venerando, A.; Sarno, S.; Pinna, L.A. The Selectivity of CK2 Inhibitor Quinalizarin: A Reevaluation. BioMed Res. Int. 2015, 2015, 734127. [Google Scholar] [CrossRef]
  14. Cozza, G.; Gianoncelli, A.; Bonvini, P.; Zorzi, E.; Pasquale, R.; Rosolen, A.; Pinna, L.A.; Meggio, F.; Zagotto, G.; Moro, S. Urolithin as a Converging Scaffold Linking Ellagic acid and Coumarin Analogues: Design of Potent Protein Kinase CK2 Inhibitors. ChemMedChem 2011, 6, 2273–2286. [Google Scholar] [CrossRef] [PubMed]
  15. Tanoli, Z.; Fernández-Torras, A.; Özcan, U.O.; Kushnir, A.; Nader, K.M.; Gadiya, Y.; Fiorenza, L.; Ianevski, A.; Vähä-Koskela, M.; Miihkinen, M.; et al. Computational Drug Repurposing: Approaches, Evaluation of In Silico Resources and Case Studies. Nat. Rev. Drug. Discov. 2025, 24, 521–542. [Google Scholar] [CrossRef] [PubMed]
  16. Sun, H.; Wu, X.; Xu, X.; Jiang, Z.; Liu, Z.; You, Q. Discovery of novel CK2 leads by cross-docking based virtual screening. Med. Chem. 2014, 10, 628–639. [Google Scholar] [CrossRef] [PubMed]
  17. De Fusco, C.; Brear, P.; Iegre, J.; Georgiou, K.H.; Sore, H.F.; Hyvönen, M.; Spring, D.R. A fragment-based approach leading to the discovery of a novel binding site and the selective CK2 inhibitor CAM4066. Bioorganic Med. Chem. 2017, 25, 3471–3482. [Google Scholar] [CrossRef]
  18. Golub, A.G.; Bdzhola, V.G.; Kyshenia, Y.V.; Sapelkin, V.M.; Prykhod’ko, A.O.; Kukharenko, O.P.; Ostrynska, O.V.; Yarmoluk, S.M. Structure-based discovery of novel flavonol inhibitors of human protein kinase CK2. Mol. Cell. Biochem. 2011, 356, 107–115. [Google Scholar] [CrossRef]
  19. Zhang, J.; Tang, P.; Zou, L.; Zhang, J.; Chen, J.; Yang, C.; He, G.; Liu, B.; Liu, J.; Chiang, C.-M. Discovery of novel dual-target inhibitor of bromodomain-containing protein 4/casein kinase 2 inducing apoptosis and autophagy-associated cell death for triple-negative breast cancer therapy. J. Med. Chem. 2021, 64, 18025–18053. [Google Scholar] [CrossRef]
  20. Sun, H.; Xu, X.; Wu, X.; Zhang, X.; Liu, F.; Jia, J.; Guo, X.; Huang, J.; Jiang, Z.; Feng, T. Discovery and design of tricyclic scaffolds as protein kinase CK2 (CK2) inhibitors through a combination of shape-based virtual screening and structure-based molecular modification. J. Chem. Inf. Model. 2013, 53, 2093–2102. [Google Scholar] [CrossRef]
  21. Patel, S.; Patel, S.; Tulsian, K.; Kumar, P.; Vyas, V.K.; Ghate, M. Design of 2-amino-6-methyl-pyrimidine benzoic acids as ATP competitive casein kinase-2 (CK2) inhibitors using structure- and fragment-based design, docking and molecular dynamic simulation studies. SAR QSAR Environ. Res. 2023, 34, 211–230. [Google Scholar] [CrossRef]
  22. Vyas, V.K.; Ghate, M.; Goel, A. Pharmacophore modeling, virtual screening, docking, and in silico ADMET analysis of protein kinase B (PKB β) inhibitors. J. Mol. Graph. Model. 2013, 42, 17–25. [Google Scholar] [CrossRef] [PubMed]
  23. Anjum, F.; Sulaimani, M.N.; Shafie, A.; Mohammad, T.; Ashraf, G.M.; Bilgrami, A.L.; Alhumaydhi, F.A.; Alsagaby, S.A.; Yadav, D.K.; Hassan, M.I. Bioactive phytoconstituents as potent inhibitors of casein kinase-2: Dual implications in cancer and COVID-19 therapeutics. RSC Adv. 2022, 12, 7872–7882. [Google Scholar] [CrossRef] [PubMed]
  24. Cozza, G. The Development of CK2 Inhibitors: From Traditional Pharmacology to in Silico Rational Drug Design. Pharmaceuticals 2017, 10, 26. [Google Scholar] [CrossRef] [PubMed]
  25. Ul-Haq, Z.; Ashraf, S.; Bkhaitan, M.M. Molecular dynamics simulations reveal structural insights into inhibitor binding modes and mechanism of casein kinase II inhibitors. J. Biomol. Struct. Dyn. 2019, 37, 1120–1135. [Google Scholar] [CrossRef]
  26. Chilin, A.; Battistutta, R.; Bortolato, A.; Cozza, G.; Zanatta, S.; Poletto, G.; Mazzorana, M.; Zagotto, G.; Uriarte, E.; Guiotto, A.; et al. Coumarin as Attractive Casein Kinase 2 (CK2) Inhibitor Scaffold: An Integrate Approach To Elucidate the Putative Binding Motif and Explain Structure–Activity Relationships. J. Med. Chem. 2008, 51, 752–759. [Google Scholar] [CrossRef]
  27. Burley, S.K.; Berman, H.M.; Bhikadiya, C.; Bi, C.; Chen, L.; Di Costanzo, L.; Christie, C.; Dalenberg, K.; Duarte, J.M.; Dutta, S. RCSB Protein Data Bank: Biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res. 2019, 47, D464–D474. [Google Scholar] [CrossRef] [PubMed]
  28. Bell, J.; Cao, Y.; Gunn, J.; Day, T.; Gallicchio, E.; Zhou, Z.; Levy, R.; Farid, R. PrimeX and the Schrödinger computational chemistry suite of programs. Int. Tables Crystallogr. 2012, 18, 534–538. [Google Scholar]
  29. Zdrazil, B.; Felix, E.; Hunter, F.; Manners, E.J.; Blackshaw, J.; Corbett, S.; De Veij, M.; Ioannidis, H.; Lopez, D.M.; Mosquera, J.F. The ChEMBL Database in 2023: A drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res. 2024, 52, D1180–D1192. [Google Scholar] [CrossRef]
  30. Bento, A.P.; Hersey, A.; Félix, E.; Landrum, G.; Gaulton, A.; Atkinson, F.; Bellis, L.J.; De Veij, M.; Leach, A.R. An open source chemical structure curation pipeline using RDKit. J. Cheminform. 2020, 12, 51. [Google Scholar] [CrossRef]
  31. Mysinger, M.M.; Carchia, M.; Irwin, J.J.; Shoichet, B.K. Directory of useful decoys, enhanced (DUD-E): Better ligands and decoys for better benchmarking. J. Med. Chem. 2012, 55, 6582–6594. [Google Scholar] [CrossRef]
  32. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra precision glide: Docking and scoring incorporating a model of hydrophobic enclosure for protein- ligand complexes. J. Med. Chem. 2006, 49, 6177–6196. [Google Scholar] [CrossRef]
  33. Richardson, E.; Trevizani, R.; Greenbaum, J.A.; Carter, H.; Nielsen, M.; Peters, B. The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns 2024, 5, 100994. [Google Scholar] [CrossRef]
  34. Probst, D.; Reymond, J.-L. A probabilistic molecular fingerprint for big data settings. J. Cheminform. 2018, 10, 66. [Google Scholar] [CrossRef] [PubMed]
  35. Maestro, S. Maestro; Schrödinger, LLC: New York, NY, USA, 2020. [Google Scholar]
  36. Ntie-Kang, F.; Zofou, D.; Babiaka, S.B.; Meudom, R.; Scharfe, M.; Lifongo, L.L.; Mbah, J.A.; Mbaze, L.M.a.; Sippl, W.; Efange, S.M. AfroDb: A select highly potent and diverse natural product library from African medicinal plants. PLoS ONE 2013, 8, e78085. [Google Scholar] [CrossRef] [PubMed]
  37. Sorokina, M.; Merseburger, P.; Rajan, K.; Yirik, M.A.; Steinbeck, C. COCONUT online: Collection of open natural products database. J. Cheminformatics 2021, 13, 2. [Google Scholar] [CrossRef] [PubMed]
  38. DeLano, W.L. Pymol: An open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr. 2002, 40, 82–92. [Google Scholar]
  39. Discovery Studio, version 2.1; Accelrys: San Diego, CA, USA, 2008.
  40. Land, H.; Humble, M.S. YASARA: A tool to obtain structural guidance in biocatalytic investigations. In Protein Engineering; Springer: Berlin/Heidelberg, Germany, 2018; pp. 43–67. [Google Scholar]
  41. Wang, J.; Wolf, R.M.; Caldwell, J.W.; Kollman, P.A.; Case, D.A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174. [Google Scholar] [CrossRef]
  42. Yin, L.-L.; Xu, J.-K.; Wang, X.-J.; Gao, S.-Q.; Lin, Y.-W. Molecular Dynamics Simulation and Kinetic Study of Fluoride Binding to V21C/V66C Myoglobin with a Cytoglobin-like Disulfide Bond. Int. J. Mol. Sci. 2020, 21, 2512. [Google Scholar] [CrossRef]
  43. Moroz-Omori, E.V.; Huang, D.; Kumar Bedi, R.; Cheriyamkunnel, S.J.; Bochenkova, E.; Dolbois, A.; Rzeczkowski, M.D.; Li, Y.; Wiedmer, L.; Caflisch, A. METTL3 inhibitors for epitranscriptomic modulation of cellular processes. ChemMedChem 2021, 16, 3035–3043. [Google Scholar] [CrossRef]
  44. Xu, W.; Xie, X.-J.; Faust, A.K.; Liu, M.; Li, X.; Chen, F.; Naquin, A.A.; Walton, A.C.; Kishbaugh, P.W.; Ji, J.-Y. All-atomic molecular dynamic studies of Human and Drosophila CDK8: Insights into their kinase domains, the LXXLL Motifs, and drug binding site. Int. J. Mol. Sci. 2020, 21, 7511. [Google Scholar] [CrossRef]
  45. Roe, D.R.; Cheatham, T.E., III. PTRAJ and CPPTRAJ: Software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 2013, 9, 3084–3095. [Google Scholar] [CrossRef]
  46. Chen, F.; Liu, H.; Sun, H.; Pan, P.; Li, Y.; Li, D.; Hou, T. Assessing the performance of the MM/PBSA and MM/GBSA methods. 6. Capability to predict protein–protein binding free energies and re-rank binding poses generated by protein–protein docking. Phys. Chem. Chem. Phys. 2016, 18, 22129–22139. [Google Scholar] [CrossRef] [PubMed]
  47. Khan, A.; Zia, T.; Suleman, M.; Khan, T.; Ali, S.S.; Abbasi, A.A.; Mohammad, A.; Wei, D.-Q. Higher infectivity of the SARS-CoV-2 new variants is associated with K417N/T, E484K, and N501Y mutants: An insight from structural data. J. Cell. Physiol. 2021, 236, 7045–7057. [Google Scholar] [CrossRef]
  48. Khan, A.; Heng, W.; Wang, Y.; Qiu, J.; Wei, X.; Peng, S.; Saleem, S.; Khan, M.; Ali, S.S.; Wei, D.Q. In silico and in vitro evaluation of kaempferol as a potential inhibitor of the SARS-CoV-2 main protease (3CLpro). Phytother. Res. PTR 2021, 35, 2841–2845. [Google Scholar] [CrossRef] [PubMed]
  49. Khan, A.; Chandra Kaushik, A.; Ali, S.S.; Ahmad, N.; Wei, D.-Q. Deep-learning-based target screening and similarity search for the predicted inhibitors of the pathways in Parkinson’s disease. RSC Adv. 2019, 9, 10326–10339. [Google Scholar] [CrossRef] [PubMed]
  50. Sun, Q.; Berkelbach, T.C.; Blunt, N.S.; Booth, G.H.; Guo, S.; Li, Z.; Liu, J.; McClain, J.D.; Sayfutyarova, E.R.; Sharma, S. PySCF: The Python-based simulations of chemistry framework. WIREs Comput. Mol. Sci. 2018, 8, e1340. [Google Scholar] [CrossRef]
  51. Yuan, S.; Chan, H.S.; Hu, Z. Using PyMOL as a platform for computational drug design. WIREs Comput. Mol. Sci. 2017, 7, e1298. [Google Scholar] [CrossRef]
  52. Moriwaki, H.; Tian, Y.-S.; Kawashita, N.; Takagi, T. Mordred: A molecular descriptor calculator. J. Cheminform. 2018, 10, 4. [Google Scholar] [CrossRef]
  53. Halgren, T.A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem. 1996, 17, 490–519. [Google Scholar] [CrossRef]
  54. Riniker, S.; Landrum, G.A. Better informed distance geometry: Using what we know to improve conformation generation. J. Chem. Inf. Model. 2015, 55, 2562–2574. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Structural coordinates and binding of CX-4945 with CK2. (A) The cartoon representation of CK2 in complex with CX-4945. The bound ligand (CX-4945) is shown in the circle. (B) shows the surface representation of the CK2, while (C,D) show the 3D interaction pattern of CX-4945 with CK2 and a 2D structure of CX-4945.
Figure 2. Performance evaluation and chemical space analysis of the CK2 inhibitor classification model. (A–C) Receiver operating characteristic (ROC, AUC = 0.748) and precision–recall (AP = 0.423) curves demonstrate moderate predictive performance, with cumulative enrichment confirming prioritized retrieval of active compounds early in the screening process. (D–F) Early enrichment factors (EF1–20%) indicate efficient hit recovery within the top-ranked subset of the library. Violin plots show clear separation in score distributions between active and inactive molecules, while PCA of ECFP4 fingerprints reveals broad chemical diversity across the screened space, with higher scores clustering in distinct regions.
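The ROC–AUC and early enrichment factors reported in Figure 2 can be illustrated with a short sketch. It assumes the commonly used definitions, EF at x% = (actives in the top x% / compounds in the top x%) / (total actives / total compounds) and AUC as the probability that a randomly chosen active ranks ahead of a randomly chosen inactive, and it assumes that more negative docking scores rank higher. The article does not state the exact implementation, and the scores and labels below are invented toy values, not data from the study.

```python
def enrichment_factor(scores, labels, fraction):
    """Enrichment factor in the top `fraction` of a ranked list.

    scores: docking scores (more negative = better, an assumption here)
    labels: 1 for active, 0 for inactive/decoy
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0])
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)


def roc_auc(scores, labels):
    """Pairwise ROC-AUC: fraction of (active, inactive) pairs ranked correctly."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need both actives and inactives")
    wins = 0.0
    for a in actives:
        for d in inactives:
            if a < d:        # active has the better (lower) score
                wins += 1.0
            elif a == d:     # ties count as half
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Toy example (hypothetical values for illustration only)
toy_scores = [-12.0, -11.0, -10.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef20 = enrichment_factor(toy_scores, toy_labels, 0.20)
auc = roc_auc(toy_scores, toy_labels)
```

The pairwise AUC is quadratic in the number of compounds, which is fine for a small benchmark set but would normally be replaced by a rank-based formula for a library of several hundred thousand molecules.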
Figure 3. Interaction pattern for the top hits Anastatin B and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one. (A) shows the 3D and 2D interaction patterns for Anastatin B, while (B) shows the 3D and 2D interaction patterns for 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one. The figures also show the docking score in kcal/mol. Residue interaction maps depict hydrogen bonds as green dashed lines and green nodes (pink in the 3D views), van der Waals contacts as light green nodes, hydrophobic (π–alkyl/alkyl) interactions as pink or purple nodes, π–π stacking or aromatic interactions as orange or yellow nodes, and solvent-accessible regions as blue halos or outlines.
Figure 4. Interaction pattern for the top hits 6-Methoxyquercetin and Rhein. (A) shows the 3D and 2D interaction patterns for 6-Methoxyquercetin, while (B) shows the 3D and 2D interaction patterns for Rhein. The figures also show the docking score in kcal/mol. Residue interaction maps depict hydrogen bonds as green dashed lines and green nodes (pink in the 3D views), van der Waals contacts as light green nodes, hydrophobic (π–alkyl/alkyl) interactions as pink or purple nodes, π–π stacking or aromatic interactions as orange or yellow nodes, and solvent-accessible regions as blue halos or outlines.
Figure 5. Interaction pattern for the top hits Parietinic acid and aloe emodin acetate. (A) shows the 3D and 2D interaction patterns for Parietinic acid, while (B) shows the 3D and 2D interaction patterns for aloe emodin acetate. The figures also show the docking score in kcal/mol. Residue interaction maps depict hydrogen bonds as green dashed lines and green nodes (pink in the 3D views), van der Waals contacts as light green nodes, hydrophobic (π–alkyl/alkyl) interactions as pink or purple nodes, π–π stacking or aromatic interactions as orange or yellow nodes, and solvent-accessible regions as blue halos or outlines.
Figure 6. Dynamic stability assessment through RMSD calculation for the top hits against CK2. (A) shows the RMSD graphs for the control and Anastatin B–CK2 complexes, (B) shows the RMSD graphs for the control and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complexes, (C) shows the RMSD graphs for the control and 6-methoxyquercetin–CK2 complexes, (D) shows the RMSD graphs for the control and aloe emodin–CK2 complexes, (E) shows the RMSD graphs for the control and Parietinic acid–CK2 complexes, while (F) shows the RMSD graphs for the control and Rhein–CK2 complexes.
Figure 7. Structural compactness analysis through Rg calculation for the top hits against CK2. (A) shows the Rg graphs for the control and Anastatin B–CK2 complexes, (B) shows the Rg graphs for the control and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complexes, (C) shows the Rg graphs for the control and 6-methoxyquercetin–CK2 complexes, (D) shows the Rg graphs for the control and aloe emodin–CK2 complexes, (E) shows the Rg graphs for the control and Parietinic acid–CK2 complexes while (F) shows the Rg graphs for the control and Rhein–CK2 complexes.
Figure 8. Residue fluctuation indexing in a dynamic environment. (A) shows the RMSF for all the complexes, while (B) shows the highly dynamic regions in the proteins during the simulation. Residue-wise fluctuations were calculated from the equilibrated MD trajectories to evaluate local flexibility and stability across the protein structure. Ligand-bound systems exhibit reduced fluctuations in key functional regions compared to the control, indicating enhanced structural stabilization upon binding. The highlighted regions correspond to flexible domains that play a critical role in conformational adaptability and ligand interaction dynamics.
Figure 9. Hydrogen bonding graphs for the top hits against CK2. (A) shows the H-bond graphs for the control and Anastatin B–CK2 complexes, (B) shows the H-bond graphs for the control and 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one–CK2 complexes, (C) shows the H-bond graphs for the control and 6-methoxyquercetin–CK2 complexes, (D) shows the H-bond graphs for the control and aloe emodin–CK2 complexes, (E) shows the H-bond graphs for the control and Parietinic acid–CK2 complexes, while (F) shows the H-bond graphs for the control and Rhein–CK2 complexes.
Figure 10. Machine learning model benchmarking for CK2 inhibitor prediction. Comparison of train–test performance (R2), generalization gap (ΔR2), and pIC50 value distribution across 2D, 3D, fingerprint, and combined feature sets. Stacking ensemble models achieved the highest predictive performance, with the combined feature model reaching a test R2 of ≈ 0.69 and consistent cross-validation performance, indicating robust generalization across CK2 inhibitor datasets.
Table 1. Extra precision-based predicted top leads from natural product chemical space. The docking scores are given in kcal/mol. The table shows the 2D structures of the selected top hits and the control (CX-4945) ranked using docking scores. The interacting residues involved in hydrogen bonding, hydrophobic, and other interactions are also mentioned.

| S. No | 2D Structure | Name | Hydrogen Bonding Residues | Hydrophobic/Other Interactions | Docking Score (kcal/mol) |
|---|---|---|---|---|---|
| 1 | (image) | CX-4945 | Arg47, Lys49, Tyr50, Lys68, Ser51, Val116 | | −9.57 |
| 2 | (image) | Parietinic Acid | Lys68, Glu114 and Val116 | Leu45, Val53, Val66, Lys68, Ile95, Phe113, Met163 and Ile174 | −11.65 |
| 3 | (image) | Rhein | Lys68, Glu114, Val116 and Asp175 | Val53, Val66, Lys68, Ile95, Met163 and Ile174 | −12.36 |
| 4 | (image) | 6-methoxyquercetin | Leu45, Lys68, Glu114, Val116 and Asp175 | Val53, Val66, Lys68, Ile95, Met163 and Ile174 | −11.60 |
| 5 | (image) | 3,4,8,9,10-pentahydroxy-dibenzo-[b,d]pyran-6-one | Lys68, Val116 and Asp175 | Gly46, Val53, Val66, Ile95, Met163 and Ile174 | −12.36 |
| 6 | (image) | Anastatin B | Arg47, Lys68, Val116 and Asp175 | Gly46, Val53, Val66, Ile95, Met163 and Ile174 | −13.12 |
| 7 | (image) | Aloe Emodin | Glu114, Val116, Asn118 and Asp175 | Leu45, Val53, Val66, Lys68, Ile95, Phe113, Met163 and Ile174 | −11.73 |
Table 2. MM-PBSA binding free energy decomposition of the control and selected top-hit complexes calculated over different simulation intervals (1–10 ns, 11–30 ns, and 186–200 ns). All energy values are reported in kcal/mol as mean ± standard error.
Table 2. MM-PBSA binding free energy decomposition of the control and selected top-hit complexes calculated over different simulation intervals (1–10 ns, 11–30 ns, and 186–200 ns). All energy values are reported in kcal/mol as mean ± standard error.
| Interval | MM/PBSA term | Control | Anastatin B | Aloe Emodin | Parietinic Acid | 3–4 Penta | 6-MQ | Rhein |
|---|---|---|---|---|---|---|---|---|
| 1–10 ns | vdWaals | −33.58 ± 0.15 | −42.78 ± 0.32 | −39.02 ± 0.13 | −29.14 ± 0.11 | −24.99 ± 0.19 | −38.41 ± 0.15 | −27.27 ± 0.29 |
| | EEL | −21.62 ± 0.31 | −9.12 ± 0.21 | −5.97 ± 0.21 | −122.47 ± 0.53 | −9.12 ± 0.18 | −13.47 ± 0.20 | −175.31 ± 1.46 |
| | EPB | 42.56 ± 0.40 | 28.53 ± 0.29 | 21.50 ± 0.19 | 137.44 ± 0.51 | 20.23 ± 0.28 | 31.00 ± 0.20 | 190.65 ± 1.11 |
| | ENPOLAR | −2.87 ± 0.00 | −3.46 ± 0.29 | −3.59 ± 0.00 | −3.11 ± 0.07 | −2.48 ± 0.13 | −3.07 ± 0.06 | −2.86 ± 0.011 |
| | EDISPER | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 |
| | Delta G Gas | −55.21 ± 0.42 | −51.91 ± 0.26 | −44.99 ± 0.28 | −151.62 ± 0.55 | −34.12 ± 0.29 | −51.89 ± 0.26 | −202.61 ± 1.26 |
| | Delta G Solv | 39.68 ± 0.40 | 25.06 ± 0.28 | 17.90 ± 0.19 | 134.33 ± 0.51 | 17.75 ± 0.27 | 27.93 ± 0.20 | 187.78 ± 1.12 |
| | Delta Total | −15.52 ± 0.16 | −26.85 ± 0.20 | −27.09 ± 0.22 | −17.28 ± 0.10 | −16.37 ± 0.15 | −23.95 ± 0.18 | −14.82 ± 0.37 |
| 11–30 ns | vdWaals | −34.85 ± 0.13 | −43.68 ± 0.14 | −38.83 ± 0.08 | −30.65 ± 0.10 | −21.21 ± 0.18 | −37.02 ± 0.12 | −24.19 ± 0.15 |
| | EEL | −24.13 ± 0.25 | −8.22 ± 0.10 | −4.51 ± 0.12 | −125.01 ± 0.61 | −7.9 ± 0.12 | −12.83 ± 0.11 | −169.14 ± 1.13 |
| | EPB | 45.71 ± 0.32 | 28.04 ± 0.14 | 20.29 ± 0.11 | 141.30 ± 0.55 | 17.11 ± 0.19 | 29.75 ± 0.15 | 181.00 ± 1.06 |
| | ENPOLAR | −2.86 ± 0.00 | −3.45 ± 0.00 | −3.57 ± 0.00 | −3.22 ± 0.00 | −2.16 ± 0.01 | −3.05 ± 0.04 | −2.67 ± 0.00 |
| | EDISPER | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 |
| | Delta G Gas | −58.99 ± 0.35 | −51.91 ± 0.15 | −43.34 ± 0.17 | −155.66 ± 0.59 | −29.12 ± 0.26 | −49.85 ± 0.17 | −193.35 ± 1.09 |
| | Delta G Solv | 42.85 ± 0.12 | 24.59 ± 0.14 | 16.72 ± 0.11 | 138.08 ± 0.55 | 14.95 ± 0.18 | 26.69 ± 0.15 | 178.33 ± 1.06 |
| | Delta Total | −16.14 ± 0.12 | −27.32 ± 0.10 | −26.62 ± 0.13 | −17.58 ± 0.13 | −14.17 ± 0.12 | −23.16 ± 0.10 | −15.01 ± 0.16 |
| 186–200 ns | vdWaals | −32.45 ± 0.10 | −43.89 ± 0.09 | −36.42 ± 0.06 | −32.43 ± 0.9 | −24.05 ± 0.07 | −31.57 ± 0.06 | −13.59 ± 0.17 |
| | EEL | −19.14 ± 0.16 | −9.14 ± 0.09 | −7.13 ± 0.07 | −160.87 ± 0.71 | −13.64 ± 0.18 | −2.69 ± 0.07 | −205.86 ± 1.43 |
| | EPB | 37.88 ± 0.23 | 28.63 ± 0.12 | 22.94 ± 0.09 | 174.16 ± 0.65 | 25.90 ± 0.22 | 18.14 ± 0.09 | 206.52 ± 1.44 |
| | ENPOLAR | −2.80 ± 0.00 | −3.44 ± 0.12 | −3.52 ± 0.00 | −3.19 ± 0.00 | −2.36 ± 0.03 | −2.95 ± 0.02 | −1.58 ± 0.01 |
| | EDISPER | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 |
| | Delta G Gas | −51.59 ± 0.23 | −53.04 ± 0.14 | −43.56 ± 0.10 | −193.31 ± 0.68 | −37.69 ± 0.22 | −34.26 ± 0.09 | −219.47 ± 1.49 |
| | Delta G Solv | 35.08 ± 0.23 | 25.84 ± 0.08 | 19.42 ± 0.09 | 170.97 ± 0.65 | 23.54 ± 0.22 | 15.19 ± 0.08 | 204.94 ± 1.43 |
| | Delta Total | −16.51 ± 0.08 | −27.84 ± 0.08 | −24.13 ± 0.08 | −22.34 ± 0.11 | −14.15 ± 0.07 | −19.07 ± 0.07 | −14.53 ± 0.14 |
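The rows of Table 2 are linked by the usual additive MM-PBSA decomposition: the gas-phase term is the sum of the van der Waals and electrostatic energies, the solvation term is the sum of the polar (EPB), non-polar (ENPOLAR), and dispersion contributions, and the total is the sum of the two. The following is a minimal sketch of that arithmetic, assuming these additive relations hold for the reported means and using the control values for the 1–10 ns window; the function and variable names are illustrative and are not taken from the authors' workflow.

```python
def mmpbsa_totals(vdw, eel, epb, enpolar, edisper=0.0):
    """Combine MM-PBSA components (kcal/mol) into gas, solvation, and total terms."""
    delta_g_gas = vdw + eel                  # gas-phase interaction energy
    delta_g_solv = epb + enpolar + edisper   # polar + non-polar solvation
    return delta_g_gas, delta_g_solv, delta_g_gas + delta_g_solv


# Control complex, 1-10 ns window: mean values copied from Table 2
gas, solv, total = mmpbsa_totals(vdw=-33.58, eel=-21.62, epb=42.56, enpolar=-2.87)

# Reported means for the same column (Delta G Gas, Delta G Solv, Delta Total)
reported = (-55.21, 39.68, -15.52)

for label, computed, ref in zip(("gas", "solv", "total"), (gas, solv, total), reported):
    print(f"{label:5s} computed={computed:8.2f} reported={ref:8.2f}")
```

Because the tabulated components are themselves rounded means, the recomputed sums can differ from the reported values in the last decimal place.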
Table 3. MM-GBSA binding free energy decomposition of the control and selected top-hit complexes calculated over different simulation intervals (1–10 ns, 11–30 ns, and 186–200 ns). All energy values are reported in kcal/mol as mean ± standard error.
| Interval | MM/GBSA term | Control | Anastatin B | Aloe Emodin | Parietinic Acid | 3–4 Penta | 6-MQ | Rhein |
|---|---|---|---|---|---|---|---|---|
| 1–10 ns | vdWaals | −33.58 ± 0.15 | −42.78 ± 0.32 | −39.02 ± 0.13 | −29.14 ± 0.11 | −24.99 ± 0.19 | −38.41 ± 0.15 | −27.27 ± 0.29 |
| | EEL | −21.62 ± 0.31 | −9.12 ± 0.21 | −5.97 ± 0.21 | −122.47 ± 0.53 | −9.12 ± 0.18 | −13.47 ± 0.20 | −175.31 ± 1.46 |
| | EGB | 43.05 ± 0.29 | 27.53 ± 0.29 | 16.82 ± 0.19 | 139.67 ± 0.51 | 24.35 ± 0.19 | 32.74 ± 0.16 | 189.07 ± 1.29 |
| | ESURF | −4.13 ± 0.01 | −4.92 ± 0.25 | −5.19 ± 0.01 | −4.20 ± 0.07 | −3.14 ± 0.13 | −4.71 ± 0.12 | −3.91 ± 0.03 |
| | Delta G Gas | −55.21 ± 0.42 | −51.91 ± 0.26 | −44.99 ± 0.28 | −151.62 ± 0.55 | −34.12 ± 0.29 | −51.89 ± 0.26 | −202.61 ± 1.26 |
| | Delta G Solv | 39.68 ± 0.40 | 22.67 ± 0.15 | 11.64 ± 0.15 | 135.15 ± 0.51 | 21.20 ± 0.27 | 28.02 ± 0.16 | 185.16 ± 1.31 |
| | Delta Total | −16.29 ± 0.16 | −29.23 ± 0.20 | −33.36 ± 0.17 | −16.15 ± 0.89 | −12.91 ± 0.15 | −23.86 ± 0.17 | −17.44 ± 0.66 |
| 11–30 ns | vdWaals | −34.85 ± 0.13 | −43.68 ± 0.14 | −38.83 ± 0.08 | −30.65 ± 0.10 | −21.21 ± 0.18 | −37.02 ± 0.12 | −24.19 ± 0.15 |
| | EEL | −24.13 ± 0.25 | −8.22 ± 0.10 | −4.51 ± 0.12 | −125.01 ± 0.61 | −7.9 ± 0.12 | −12.83 ± 0.11 | −169.14 ± 1.32 |
| | EGB | 45.71 ± 0.32 | 27.08 ± 0.14 | 15.82 ± 0.92 | 143.20 ± 0.56 | 21.41 ± 0.19 | 31.50 ± 0.10 | 181.31 ± 1.04 |
| | ESURF | −4.12 ± 0.08 | −4.96 ± 0.01 | −5.17 ± 0.07 | −4.43 ± 0.01 | −2.69 ± 0.01 | −4.54 ± 0.13 | −3.46 ± 0.18 |
| | Delta G Gas | −58.99 ± 0.35 | −51.91 ± 0.15 | −43.34 ± 0.17 | −155.66 ± 0.59 | −29.12 ± 0.26 | −49.85 ± 0.17 | −193.35 ± 1.09 |
| | Delta G Solv | 41.14 ± 0.23 | 22.11 ± 0.09 | 10.65 ± 0.88 | 138.77 ± 0.56 | 18.71 ± 0.15 | 22.96 ± 0.10 | 177.85 ± 1.04 |
| | Delta Total | −17.85 ± 0.14 | −27.79 ± 0.09 | −32.69 ± 0.11 | −16.89 ± 0.97 | −10.40 ± 0.12 | −22.89 ± 0.11 | −15.49 ± 0.12 |
| 186–200 ns | vdWaals | −32.45 ± 0.10 | −43.89 ± 0.09 | −36.42 ± 0.06 | −32.43 ± 0.9 | −24.05 ± 0.07 | −31.57 ± 0.06 | −13.59 ± 0.17 |
| | EEL | −19.14 ± 0.16 | −9.14 ± 0.09 | −7.13 ± 0.07 | −160.87 ± 0.71 | −13.64 ± 0.18 | −2.69 ± 0.07 | −205.86 ± 1.43 |
| | EGB | 38.14 ± 0.17 | 28.02 ± 0.12 | 20.91 ± 0.65 | 174.92 ± 0.64 | 29.12 ± 0.22 | 20.09 ± 0.65 | 208.16 ± 1.44 |
| | ESURF | −3.79 ± 0.00 | −4.96 ± 0.12 | −4.83 ± 0.08 | −4.39 ± 0.08 | −3.11 ± 0.06 | −4.25 ± 0.06 | −1.83 ± 0.01 |
| | Delta G Gas | −51.59 ± 0.23 | −53.04 ± 0.14 | −43.56 ± 0.10 | −193.31 ± 0.68 | −37.69 ± 0.22 | −34.26 ± 0.09 | −219.47 ± 1.49 |
| | Delta G Solv | 34.34 ± 0.17 | 23.06 ± 0.08 | 16.08 ± 0.69 | 170.52 ± 0.65 | 26.01 ± 0.19 | 15.83 ± 0.63 | 206.32 ± 1.43 |
| | Delta Total | −17.25 ± 0.08 | −27.97 ± 0.07 | −27.47 ± 0.07 | −22.78 ± 0.91 | −11.68 ± 0.63 | −18.42 ± 0.05 | −13.15 ± 0.15 |
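One way to read the final simulation window of Table 3 is to rank the complexes by their mean MM-GBSA Delta Total and express each value as an offset from the control. The sketch below does only that, using the reported 186–200 ns means; it ignores the standard errors, and the helper name `rank_against_control` is hypothetical rather than part of the authors' pipeline.

```python
# Mean MM-GBSA Delta Total (kcal/mol), 186-200 ns window, copied from Table 3
delta_total = {
    "Control": -17.25,
    "Anastatin B": -27.97,
    "Aloe Emodin": -27.47,
    "Parietinic Acid": -22.78,
    "3–4 Penta": -11.68,
    "6-MQ": -18.42,
    "Rhein": -13.15,
}


def rank_against_control(totals, reference="Control"):
    """Sort complexes from most to least negative and add the offset from the reference."""
    ref = totals[reference]
    ranked = sorted(totals.items(), key=lambda item: item[1])
    return [(name, value, value - ref) for name, value in ranked]


ranking = rank_against_control(delta_total)
for name, value, offset in ranking:
    print(f"{name:16s} {value:8.2f} {offset:+8.2f}")
```

A negative offset indicates a more favorable estimated binding free energy than the control in this window. Such end-point estimates are approximate, so small differences between neighboring entries should not be over-interpreted.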
Table 4. Summary of model performance metrics for different feature sets. The table reports the coefficient of determination (R²) for training, test, and cross-validation (CV) datasets, along with the difference between training and test performance (ΔR²). Values are presented as mean ± standard deviation (SD) across cross-validation folds.
| Feature Set | Model | Train R² | Test R² | ΔR² | CV R² (±SD) |
|---|---|---|---|---|---|
| 2D | Random Forest | 0.800 | 0.652 | 0.148 | 0.633 ± 0.036 |
| | Gradient Boosting | 0.807 | 0.653 | 0.155 | 0.639 ± 0.044 |
| | HistGB | 0.876 | 0.642 | 0.235 | 0.624 ± 0.064 |
| | Extra Trees | 0.884 | 0.644 | 0.240 | 0.620 ± 0.051 |
| | SVM | 0.803 | 0.603 | 0.200 | 0.626 ± 0.024 |
| | KNN | 0.725 | 0.555 | 0.170 | 0.546 ± 0.055 |
| | Stacking | 0.854 | 0.672 | 0.182 | 0.664 ± 0.044 |
| 3D | Random Forest | 0.746 | 0.459 | 0.287 | 0.460 ± 0.051 |
| | Extra Trees | 0.884 | 0.543 | 0.342 | 0.445 ± 0.079 |
| | Stacking | 0.835 | 0.536 | 0.299 | 0.530 ± 0.034 |
| FP | Random Forest | 0.799 | 0.668 | 0.130 | 0.666 ± 0.041 |
| | Gradient Boosting | 0.803 | 0.664 | 0.139 | 0.664 ± 0.043 |
| | HistGB | 0.875 | 0.680 | 0.195 | 0.656 ± 0.063 |
| | ElasticNet | 0.801 | 0.675 | 0.126 | 0.656 ± 0.036 |
| | Stacking | 0.841 | 0.690 | 0.151 | 0.693 ± 0.0036 |
| 2D + 3D | Random Forest | 0.853 | 0.650 | 0.202 | 0.628 ± 0.046 |
| | Gradient Boosting | 0.806 | 0.647 | 0.159 | 0.636 ± 0.047 |
| | Stacking | 0.858 | 0.672 | 0.186 | 0.669 ± 0.046 |
| 2D + FP | Random Forest | 0.840 | 0.672 | 0.168 | 0.661 ± 0.042 |
| | Gradient Boosting | 0.818 | 0.688 | 0.130 | 0.674 ± 0.035 |
| | Stacking | 0.840 | 0.691 | 0.150 | 0.689 ± 0.038 |
| 3D + FP | Random Forest | 0.844 | 0.673 | 0.171 | 0.663 ± 0.042 |
| | Gradient Boosting | 0.812 | 0.668 | 0.144 | 0.669 ± 0.038 |
| | Stacking | 0.845 | 0.688 | 0.157 | 0.694 ± 0.039 |
| Combined | Random Forest | 0.817 | 0.681 | 0.136 | 0.667 ± 0.040 |
| | Gradient Boosting | 0.823 | 0.684 | 0.139 | 0.694 ± 0.039 |
| | Stacking | 0.851 | 0.692 | 0.160 | 0.690 ± 0.039 |
| GNN | GIN | 0.040 | 0.020 | 0.020 | |
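The ΔR² column in Table 4 is the difference between training and test R², which serves as a simple indicator of overfitting. As a minimal sketch, assuming ΔR² = Train R² − Test R² as stated in the caption, the gap can be recomputed from the rounded table entries for a few representative rows; the row selection and the names used here are illustrative only.

```python
# (feature set, model, train R2, test R2, reported delta R2), copied from Table 4
rows = [
    ("2D", "Random Forest", 0.800, 0.652, 0.148),
    ("FP", "Stacking", 0.841, 0.690, 0.151),
    ("2D + FP", "Stacking", 0.840, 0.691, 0.150),
    ("Combined", "Stacking", 0.851, 0.692, 0.160),
]


def r2_gap(train_r2, test_r2):
    """Train-test generalization gap, rounded to the table's precision."""
    return round(train_r2 - test_r2, 3)


gaps = {(fs, model): r2_gap(train, test) for fs, model, train, test, _ in rows}

# Largest disagreement between the recomputed and the reported gap
max_diff = max(abs(gaps[(fs, model)] - rep) for fs, model, _, _, rep in rows)

for (fs, model), gap in gaps.items():
    print(f"{fs:9s} {model:14s} gap={gap:.3f}")
print(f"max |recomputed - reported| = {max_diff:.3f}")
```

Since the tabulated R² values are already rounded to three decimals, the recomputed gap can differ from the reported ΔR² by about 0.001.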
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Khan, A.; Alshabrmi, F.M.; Mohammad, A.; Shkoor, M.; Al-Zoubi, R.M.; Ming, L.C.; Agouni, A. An Integrative Computational Pipeline for CK2 Inhibitor Discovery in Triple-Negative Breast Cancer Using Virtual Screening, Molecular Dynamics, Machine Learning, and Density Functional Theory. Pharmaceuticals 2026, 19, 694. https://doi.org/10.3390/ph19050694

