Identification of DNA Methyltransferase-1 Inhibitor for Breast Cancer Therapy through Computational Fragment-Based Drug Design

Epimutation by DNA methyltransferase 1 (DNMT1), an epigenetic regulator enzyme, may lead to the proliferation of breast cancer. In this report, 168,686 natural products from the PubChem database were screened and modified in silico to obtain potential inhibitors of DNMT1. The initial screening of PubChem natural products using Lipinski's and Veber's rules of three and toxicity properties resulted in 2601 fragment candidates. Four fragments selected from the pharmacophore-based molecular docking simulation were modified using FragFP and Lipinski's and Veber's rules of five, resulting in 51,200 ligands. The toxicological screening retained 13,563 ligands for a series of pharmacophore-based molecular docking simulations, which identified the modified ligands that had better binding affinity and interactions with DNMT1 than the standards, SAH, SAM, and SFG. This step resulted in five ligand candidates, namely C-7756, C-5769, C-1723, C-2129, and C-2140. The ADME-Tox property predictions showed that the selected ligands are generally better than the standards in terms of druglikeness, GI absorption, and oral bioavailability. C-7756 exhibited a stronger affinity for DNMT1 as well as better ADME-Tox properties than the other ligands.


Introduction
Cancer is a chronic disease characterized by uncontrolled cell growth that can spread to tissues and other organs in the body [1]. Breast cancer causes many deaths among female patients and continues to be a global medical issue. Even though a number of major medical advances have improved the treatment of primary breast cancer [2], it still contributed 11.6% of the total cancer incidence burden worldwide, with approximately 2.1 million people suffering from breast cancer in 2018 [3]. In Indonesia, breast cancer continues to be the most common malignancy in women, accounting for 30.9% of a total of 188,231 new cases in 2018 [4].
DNA methyltransferase 1 (DNMT1) is an epigenetic regulator enzyme responsible for forming and maintaining DNA methylation patterns [5]. Epigenetic modification through methylation at cytosine residues of DNA plays an essential role in regulating gene expression without altering the original DNA sequence [6]. The DNMT1 protein consists of five domains: the replication foci targeting sequence (RFTS) domain, the CXXC zinc finger domain, the bromo adjacent homology 1 (BAH1) domain, the BAH2 domain, and the MTase domain (Figure 1) [7]. During DNA methylation, the methyl group (-CH3) of S-adenosyl-L-methionine (SAM) is transferred to the C5 position of cytosine in the DNA sequence, forming 5-methylcytosine (5mC) and S-adenosyl-L-homocysteine (SAH); this reaction occurs in the MTase domain [8]. In normal cells, epigenetic regulation dictates the expression of oncogenes, which encourage the formation of cancer cells, and of tumor suppressor genes, which regulate the growth and life cycle of a cell. Epimutation, or epigenetic alteration, is a change in the DNA methylation pattern that occurs in various types of cancer, with a massive upregulation of oncogene expression and a downregulation of tumor suppressor gene expression [9].
Epimutation is characterized by a decrease in global DNA methylation and an increase in DNA methylation in CpG (CG site) islands, regions containing a high density of CpG dinucleotides [10], that is, sites where a cytosine nucleotide is followed by a guanine in the linear sequence of bases along the 5′-3′ direction. CpG islands in promoters are known to regulate gene expression through transcriptional silencing of the corresponding genes [11]. CpG islands usually extend for 300-3000 base pairs in mammalian genomes and are located within or close to approximately 50% of human promoters [12,13]. In normal cells, CpG islands in gene promoters are usually unmethylated [14]. One of the leading causes of breast cancer is an increase in DNMT1 activity, which alters DNA methylation patterns and increases DNA methylation in CpG islands [15].
Natural products are known as a valuable source of medicines because of their bioactivity [16]. In addition to their anticancer activity, natural products are advantageous as drugs because of their good bioavailability and therapeutic activity [16,17]. Natural products such as epigallocatechin-3-gallate (EGCG), catechin, and quercetin exhibit plausible activity as DNMT1 inhibitors, resulting in DNA demethylation, which reactivates tumor suppressor gene expression and thereby reduces the cancer cell growth rate [18,19]. Therefore, developing drug candidates as DNMT1 inhibitors is a potential strategy for breast cancer treatment.
The in silico method, or computer-aided drug discovery and development, is a rapidly developing field because it reduces the cost and time of drug development [20]. Molecular docking and molecular dynamics simulations have been regularly used and developed to analyze the interactions, affinity, and stability of a ligand targeting other biomolecules [21]. The in silico method also makes it possible to characterize the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the studied compounds [22].

Pre-Docking Preparation
The 3D structure of the DNMT1 protein was prepared using MOE 2014.09. Small molecules and ions, such as zinc and sulfate ions, were removed from the protein target. Then, the potential setup was arranged with the Amber10:EHT force field and R-field solvation. The "LigX" protocol was performed with a tether strength of 100,000, an RMS gradient value of 0.05 kcal/mol/Å, and the "Allow ASN/GLN/HIS 'Flips' in Protonate 3D" option unchecked, while the remaining parameters were left at their defaults.
The natural product compounds from the PubChem database were chosen as the fragment library. All fragments were subjected to an initial screening for toxicity and a Rule of 3 (RO3) filter [31] using DataWarrior 4.5.2, followed by structure optimization in MOE 2014.09. For the structure optimization, the potential setup was arranged with the MMFF94x force field and R-field solvation. Then, with the "Preserve Existing Chirality" option selected and an RMS gradient of 0.001 kcal/mol/Å², the fragment library was subjected to the default "Wash" and "Energy Minimization" processes. The standard ligands, namely SAM, SAH, and SFG, were subjected to the same structure optimization process.
The last step in pre-docking preparation was to determine the pharmacophore mapping of DNMT1. It was created using the standard procedure of the Protein-Ligand Interaction Fingerprints (PLIF) method in MOE 2014.09. Seven DNMT1 protein structures with their respective ligands in the binding sites were superposed. Protein-ligand interactions were mapped to find the conserved interactions, which served as the pharmacophore features. The pharmacophore query was established at the end of this process.

Molecular Docking Simulation of the Fragment
The screened fragments were subjected to a two-step pharmacophore-based rigid molecular docking simulation against the DNMT1 protein using MOE 2014.09. The "Rigid Receptor" protocol with pharmacophore pose prediction was used, with 30 repetitions in the first simulation and 100 repetitions in the second. In both simulations, the poses were scored with the London dG scoring function, refined with the Forcefield algorithm, and rescored with the GBVI/WSA dG function, after which the root-mean-square deviation (RMSD) was calculated. Fragments that had an RMSD value lower than 2.0 Å, bound inside the binding site, and formed significant hydrogen bonds were chosen as fragment candidates.

Fragment Growing
In this research, DataWarrior 4.5.2 was used to generate ligands from the previously screened fragments, with FragFP and Lipinski's and Veber's rules (molecular weight ≤ 500 Da, −0.5 ≤ log P ≤ 5.6, H-bond acceptors ≤ 10, H-bond donors ≤ 5, TPSA ≤ 140 Å², rotatable bond count ≤ 10) as the parameters. Afterwards, any ligand predicted to be potentially mutagenic, tumorigenic, irritant, or harmful to the reproductive system, or with a druglikeness score below 0, was omitted from further analysis.
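The property thresholds listed above can be expressed as a simple filter. The following is a minimal sketch, not the DataWarrior implementation: the `Descriptors` class, the function name, and the two example compounds are hypothetical, and the descriptor values are assumed to have been computed beforehand by a cheminformatics tool.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors for one generated ligand (hypothetical container)."""
    mol_weight: float       # Da
    log_p: float            # octanol-water partition coefficient
    h_acceptors: int        # hydrogen bond acceptor count
    h_donors: int           # hydrogen bond donor count
    tpsa: float             # topological polar surface area, in square angstroms
    rotatable_bonds: int


def passes_growth_filter(d: Descriptors) -> bool:
    """Apply the Lipinski/Veber thresholds quoted in the text."""
    return (
        d.mol_weight <= 500
        and -0.5 <= d.log_p <= 5.6
        and d.h_acceptors <= 10
        and d.h_donors <= 5
        and d.tpsa <= 140
        and d.rotatable_bonds <= 10
    )


# Hypothetical descriptor values, for illustration only.
drug_like = Descriptors(320.4, 2.1, 4, 2, 43.7, 5)
too_polar = Descriptors(398.4, -3.2, 11, 4, 186.0, 7)

kept = [d for d in (drug_like, too_polar) if passes_growth_filter(d)]
```

In this sketch the second compound is rejected because its log P, acceptor count, and TPSA all fall outside the stated ranges; the toxicity and druglikeness checks described above would be applied as a separate step.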

Molecular Docking Simulation of the Ligand
The screened ligands were subjected to the same structure optimization as the fragment library. Then, molecular docking simulations of the screened ligands and the standard compounds were carried out in MOE 2014.09, following steps similar to those used for the fragment library. The "Virtual Screening", "Rigid Docking", and "Induced Fit" protocols were run sequentially with pharmacophore pose prediction, the London dG scoring function, and retain values of 1, 30, and 100 repetitions, respectively. The Forcefield algorithm with AMBER10:EHT and the GBVI/WSA dG function were selected for the pose refinement and rescoring processes. The standard ligands were subjected to the same simulations so that the predicted binding orientations of SAH, SAM, and SFG in the DNMT1 binding pocket could serve as a reference for the docking protocol. In each simulation, all ligands were ranked by ∆Gbinding energy and RMSD value. Ligands with a ∆Gbinding energy higher than that of the standard ligands or an RMSD higher than 2.0 Å were excluded.
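The exclusion rule applied after each docking round can be sketched as follows. This is an illustrative sketch only: the `DockingResult` type, the function name, and all numerical values are hypothetical, and it assumes that "lower than the standards" means lower than the most favorable (most negative) standard, with ∆Gbinding expressed in kcal/mol.

```python
from typing import List, NamedTuple


class DockingResult(NamedTuple):
    name: str
    dg_binding: float  # assumed unit: kcal/mol (more negative = stronger binding)
    rmsd: float        # angstroms


def select_hits(ligands: List[DockingResult],
                standards: List[DockingResult],
                rmsd_cutoff: float = 2.0) -> List[DockingResult]:
    """Keep ligands that beat every standard on binding energy and stay under the RMSD cutoff."""
    # Assumption: the threshold is the best (lowest) binding energy among the standards.
    threshold = min(s.dg_binding for s in standards)
    hits = [l for l in ligands if l.dg_binding < threshold and l.rmsd < rmsd_cutoff]
    # Rank the surviving ligands from most to least favorable binding energy.
    return sorted(hits, key=lambda l: l.dg_binding)


# Hypothetical values, for illustration only.
standards = [
    DockingResult("SAH", -9.0, 1.1),
    DockingResult("SAM", -9.4, 1.3),
    DockingResult("SFG", -9.2, 0.9),
]
ligands = [
    DockingResult("ligand_A", -10.8, 1.4),  # passes both criteria
    DockingResult("ligand_B", -11.5, 2.6),  # good energy, but RMSD above the cutoff
    DockingResult("ligand_C", -9.1, 0.8),   # weaker than the best standard
    DockingResult("ligand_D", -12.0, 1.9),  # passes both criteria
]

hits = select_hits(ligands, standards)
```

Comparing against the best standard is the strictest reading of the criterion; comparing against each standard separately would be an equally plausible variant.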
The pharmacophore query was generated from six DNMT1 protein structures, namely 3AV5, 3AV6, 3PTA, 4WXX, AHH92517.1a, and AHH92517.1b, with SAH, SAM, and SFG as the reference compounds for protein-ligand interactions. The PLIF protocol generated three queries: HydA, Don&Acc&ML, and Acc&ML. Then, the reference molecules were matched against the queries using the "Pharmacophore Searching" feature of MOE 2014.09 to validate the generated pharmacophore. Only the combination of HydA and Don&Acc&ML was able to detect SAH, SAM, and SFG. Therefore, this query was chosen as the pharmacophore model for the molecular docking simulation (Figure 3).
Rigid docking, which simulates the "lock and key" interaction of protein and ligand, was used as the protocol for a two-step molecular docking simulation to identify potential fragments for further modification. In each molecular docking simulation, a 2.0 Å cutoff for the RMSD value was selected because docking results with RMSD > 2.0 Å were identified as less reproducible in subsequent analyses [35]. The RMSD is a measure of similarity between the real ligand position in the receptor and the computed position of the docked ligand [36]. In molecular docking simulations, the RMSD value is used to compare the docked conformation with the reference conformation or with other docked conformations [37]. A docked conformation with an RMSD value below 2 Å is considered to have high docking accuracy [36]. The simulations were performed with retain values of 30 and 100 repetitions, respectively. The first simulation eliminated 2146 out of 2601 fragments, while the second retained 287 out of 455 fragments. From the 287 fragments that passed the molecular docking simulations, four fragments were chosen, namely 3-[(2R)-4-propyl-2-morpholinyl]phenol, 3-(2-methyl-2-azabicyclo[3.2.1]oct-5-yl)phenol, α-quinidine, and (R)-N-methylsalsolinol (Figure 4). These fragments were chosen because they bound to the protein binding site in accordance with the features of the pharmacophore query and formed the highest number of hydrogen bonds compared to the other fragments.
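For reference, the RMSD between two poses with N matched atoms is the square root of the mean squared distance between corresponding atoms. The sketch below illustrates this calculation and the 2.0 Å acceptance test; it is not MOE's implementation, the function name and coordinates are hypothetical, and it assumes both poses list the same atoms in the same order within the same receptor frame (no superposition is performed).

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def pose_rmsd(reference: Sequence[Coord], docked: Sequence[Coord]) -> float:
    """RMSD in angstroms between two poses with identical atom ordering."""
    if len(reference) != len(docked) or len(reference) == 0:
        raise ValueError("poses must contain the same, non-zero number of atoms")
    squared = sum(
        (xr - xd) ** 2 + (yr - yd) ** 2 + (zr - zd) ** 2
        for (xr, yr, zr), (xd, yd, zd) in zip(reference, docked)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical three-atom poses: the docked pose is the reference shifted by 1 angstrom along x.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]

# A pose is accepted when its RMSD is below the 2.0 angstrom cutoff used in the text.
accepted = pose_rmsd(reference_pose, docked_pose) < 2.0
```

Because every atom in the example is displaced by exactly 1 Å, the RMSD equals 1 Å and the pose falls under the cutoff.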

Fragment Growing
Fragment growing is the process of constructing a reasonable molecular structure around a fragment. It typically starts with a single fragment and proceeds by expanding the molecular structure to probe further parts of the protein binding site in order to increase its affinity [38]. The fragment growing method in DataWarrior generated 12,800 ligands from each selected fragment, totaling 51,200 ligands. However, only 13,563 ligands were cleared for the next phase of analysis after the druglikeness and toxic property screening (Table 2).
Table 2. Ligands generated from the fragment growing method using DataWarrior software. The same software was also used to perform the toxicological screening, which required druglikeness > 0 and no toxic properties (no mutagenic, tumorigenic, irritant, or reproductive effects).


Molecular Docking Simulation of the Ligand
The 13,563 ligands and three standards underwent three rounds of molecular docking simulation against the DNMT1 MTase binding pocket using MOE 2014.09. A "Virtual Screening" protocol was used first in the sequence to rapidly identify ligands that fit into the binding pocket and its determined pharmacophore. The second docking simulation used a "Rigid Receptor" protocol, in which the ligands move freely in the rigid binding pocket to find the optimal binding pose. In the last docking simulation, the "Induced Fit" protocol allowed both the protein binding pocket and the ligands to move flexibly and simultaneously to probe for the optimum protein-ligand conformation. In each step, ligands with an RMSD value lower than 2.0 Å and a Gibbs binding energy (∆Gbinding) lower than that of the standards passed the respective simulation (Figure 5). The ∆Gbinding energy was used as a parameter in addition to the RMSD value because it represents the ligand affinity for the target protein and the spontaneity of protein-ligand complex formation. Thus, in this research, ligands with a lower ∆Gbinding energy were considered to have a better affinity for DNMT1 and to bind it more spontaneously.
At the end of the simulation, 22 ligands had passed all three rounds of molecular docking simulation. These ligands were ranked based on their ∆Gbinding energy. The five top ligands, namely C-7756, C-5769, C-1723, C-2129, and C-2140, were considered to have high potency as DNMT1 MTase inhibitors (Table 3). C-7756, C-5769, and C-1723 were grown from 3-(2-methyl-2-azabicyclo[3.2.1]oct-5-yl)phenol, while C-2129 and C-2140 were grown from (R)-N-methylsalsolinol. Figure 6 shows the structures of the five potential ligands generated in this research.

[Figure 5, caption partially recovered] Ligands are marked with green dots, while the standards are marked with red dots. About 1133, 135, and 22 ligands, which had a ΔGbinding energy lower than the standards and an RMSD value lower than 2.0 Å, passed the respective virtual screening, rigid, and flexible docking simulations (blue box).
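The attrition across the workflow can be summarized with simple arithmetic on the counts reported in the text. The sketch below is only a bookkeeping aid: the stage labels and function name are chosen here for illustration, and the docking counts are the approximate figures quoted above.

```python
# Ligand counts reported in the text for each stage of the workflow.
FUNNEL = [
    ("fragment growing (4 fragments x 12,800)", 4 * 12_800),
    ("druglikeness and toxicity screening", 13_563),
    ("virtual screening docking", 1_133),
    ("rigid receptor docking", 135),
    ("induced fit docking", 22),
    ("top-ranked by binding energy", 5),
]


def retention_table(funnel):
    """Return (stage, count, fraction retained from the previous stage) for each step."""
    rows = []
    for (_, previous), (stage, count) in zip(funnel, funnel[1:]):
        rows.append((stage, count, count / previous))
    return rows


for stage, count, fraction in retention_table(FUNNEL):
    print(f"{stage:40s} {count:6d}  {fraction:6.1%} of previous stage")
```

Running the loop shows, for example, that roughly a quarter of the grown ligands survive the druglikeness and toxicity screening, and that each docking round removes the large majority of the remaining candidates.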

ADME-Tox Analysis
In this research, the molecular properties of the ligands selected from the previous docking simulations were predicted using DataWarrior software and the OSIRIS Property Explorer online web service. These tools predicted not only the molecular properties but also the drug score, which expresses the potential of a compound to become a drug based on its molecular properties and druglikeness value. The results showed that none of the five ligands violated Lipinski's RO5, while all standard ligands had a logP value lower than −0.5, more than 10 hydrogen bond acceptors, and a TPSA higher than 140 Å². Interestingly, C-7756 was the only compound in this test with a positive druglikeness value, at 1.45, while the others had negative druglikeness values; SAM, SAH, and SFG had the most negative values of all, at −9.09, −18.44, and −19.10, respectively. The positive result for C-7756 might be due to its low TPSA compared to the other four NP ligands (23.47 Å², compared to 43.70 Å² for both C-5769 and C-1723, and 67.17 Å² for both C-2129 and C-2140). Hence, C-7756 has the highest drug score among the five best NP ligands at 0.41, although this is still lower than that of both SAH and SFG at 0.42. These results can be seen in Table 4.

Table 4. The molecular properties, druglikeness, and drug score predictions of five ligands and three standard compounds using DataWarrior software and OSIRIS Property Explorer online web service.

The mutagenicity and carcinogenicity potential of the selected ligands was identified using Toxtree v2.6.6 software. This software predicts these properties from the chemical structure of the ligand, which is compared against the carcinogenicity/mutagenicity database that accompanies the software. Toxtree v2.6.6 analyzes ligand carcinogenicity based on three different mechanisms: genotoxic, non-genotoxic, and quantitative structure-activity relationship (QSAR) carcinogenicity [32,39]. Meanwhile, ligand mutagenicity was predicted based on the Ames test, which uses Salmonella typhimurium as the test organism [40]. According to the results shown in Table 5, none of the five best ligands showed mutagenic or carcinogenic properties, as they did not possess any fragments that may lead to carcinogenicity or mutagenicity. In contrast, all standard ligands, SAH, SAM, and SFG, were predicted to be genotoxic carcinogens, which may be due to the primary aromatic amines in their molecular structures.
Table 5. The mutagenicity/carcinogenicity prediction of five ligands and three standard compounds using Toxtree v2.6.6 software.

According to the results in Table 6, all five ligands were predicted to be P-gp substrates but not P-gp inhibitors. Moreover, these ligands were predicted to be CYP450 substrates, particularly CYP3A4 substrates. These results were expected, since a P-gp substrate is much more likely to behave as a CYP3A4 substrate as well [41]. In addition, the biodegradability of all five ligands was predicted, since non-biodegradable compounds should be monitored carefully because they may harm the environment, in particular aquatic life such as fish [42]. In this study, however, all ligands, including the standard ligands, were predicted to be non-biodegradable. Finally, the Ames toxicity and carcinogenicity of these ligands were predicted, and all were classified as non-Ames-toxic and non-carcinogenic. These results were similar to those obtained from Toxtree v2.6.6.
The oral bioavailability, PAINS, and synthetic accessibility were also predicted in this study, using the SwissADME web service [34]. The first indicator was gastrointestinal (GI) absorption, which is influenced by the physicochemical state of the substance [43]. Molecular traits such as MW, logP, and TPSA profoundly affect GI absorption in the human body, which inspired Lipinski's RO5 as well as Veber's and Egan's rules for identifying substances that can be absorbed well after oral administration [44-46]. In this study, the SwissADME prediction showed that all five ligands have high GI absorption and that all of them pass Veber's rule, Egan's rule, and Lipinski's RO5 without any violation. In contrast, none of SAH, SAM, and SFG has high GI absorption, probably because their high TPSA and low logP violate those rules and decrease their GI absorption. Despite these differences, all ligands were shown to have a moderate bioavailability score of 0.55.
The final two SwissADME predictions for the five best ligands and three standard ligands concerned pan-assay interference compounds (PAINS) and synthetic accessibility (SA). According to the results, C-2129 and C-2140 gave a positive PAINS result, mainly due to the catechol fragments present in both compounds. Such fragments may lead to false-positive results in high-throughput screening (HTS) against biological targets [47]. Thus, these compounds should be treated with care when they are screened by HTS. However, compared to the other three ligands, both C-2129 and C-2140 have lower SA values, which means that they are easier to synthesize, consequently reducing the cost and time needed to make these compounds in the laboratory. All results from SwissADME can be seen in Table 7.

Discussion
Molecular profiling analysis has been used to differentiate the subtypes of breast cancer, namely normal breast-like, basal-like, luminal A, luminal B, and HER-2 breast cancer [48]. Previous research has explored drug discovery for the treatment of breast cancer patients. For example, pertuzumab and trastuzumab are two drugs used in HER2-positive breast cancer patients [49]. DNMT1 is most highly expressed in basal-like breast cancer. It is also distinctly expressed in other types of breast cancer according to their molecular and stromal subtypes [50].
DNMT1 is an epigenetic regulator enzyme responsible for forming and maintaining DNA methylation patterns. In mammals, DNMT1 is essential for the functional system of the genome. Studies of DNA methylation can inform current biomedical research in areas such as carcinogenesis, host infection by different viruses, cell differentiation, autoimmune diseases, different types of mental illness, neurological disorders, and environmental toxicology [51,52]. Promoters hypermethylated through extensive DNA methylation may serve as biomarkers. Unlike irreversible genetic alterations, DNA methylation is reversible, which makes targeting it a compelling approach for breast cancer therapy [53].
DNMT inhibitors can provide novel and efficacious solutions not only for patients who suffer from hematological malignancies but also for those with other cancer types. Two azanucleoside-based DNMT inhibitors were approved by the US Food and Drug Administration (FDA) in 2013, namely decitabine (5-aza-2′-deoxycytidine) and azacytidine (Vidaza; Celgene). At lower doses, decitabine and azacytidine induce a strong demethylating effect, leading to re-expression of aberrantly silenced genes associated with reduced proliferation, apoptosis, senescence, and cell differentiation. Despite their clinical efficacy, DNA damage is observed after the incorporation of higher doses of decitabine and azacytidine. Moreover, their limitations extend to high toxicity, instability in physiological media, and poor bioavailability [54].
Natural products have been extensively studied for their function as demethylating agents or DNMT inhibitors. Several flavonoids, anthraquinones, polyphenols, and other natural products are known to inhibit the DNA methylation process carried out by DNMTs, thus decreasing the silencing of various genes involved in tumorigenesis. This may lead to the re-expression of oncogenes in diverse cancer cell lines. Laccaic acid A and epigallocatechin-3-gallate have demonstrated potent activity as DNMT1 competitive inhibitors with submicromolar IC50 values [18,19]. Despite the current advances in the development of DNMT1 inhibitors, the search for novel compounds targeting DNMTs that are not only effective but also more selective and less toxic should continue. Meanwhile, designing an inhibitor whose action relies directly on the reactivation of abnormally silenced tumor suppressor genes would be quite challenging. Hence, targeting DNMTs is a more feasible approach.
Fragment growing is an approach to improving potency and pharmacological properties by adding functional groups or substituents to the fragment core. It is used to optimize the structure toward favorable interactions with the binding site residues. The natural product compounds were chosen as the fragment library. All fragments were subjected to an initial screening with toxicity and Rule of 3 (RO3) filters. The "Rule of three" states that fragments should have a molecular weight ≤ 300 Da, cLogP ≤ 3, a hydrogen bond acceptor count ≤ 3, and a hydrogen bond donor count ≤ 3. The same analysis also indicates that additional filters, such as a rotatable bond count ≤ 3 and a topological polar surface area (TPSA) ≤ 60 Å², give more desirable fragment-like compounds [31,55,56].
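The RO3 criteria and the two additional filters can be checked by listing which limits a fragment exceeds. The following is a minimal sketch under stated assumptions: the dictionary keys, the function name, and the example descriptor values are hypothetical, and the descriptors are assumed to be precomputed.

```python
# Upper limits taken from the "Rule of three" as quoted in the text.
RO3_LIMITS = {"mol_weight": 300, "clogp": 3, "h_acceptors": 3, "h_donors": 3}
# Additional filters mentioned in the text (TPSA in square angstroms).
EXTENDED_LIMITS = {"rotatable_bonds": 3, "tpsa": 60}


def ro3_violations(descriptors, extended=True):
    """Return the names of the descriptors that exceed their RO3 limit."""
    limits = dict(RO3_LIMITS)
    if extended:
        limits.update(EXTENDED_LIMITS)
    return [name for name, limit in limits.items() if descriptors[name] > limit]


# Hypothetical descriptor values, for illustration only.
fragment_like = {"mol_weight": 221.3, "clogp": 2.4, "h_acceptors": 3,
                 "h_donors": 1, "rotatable_bonds": 3, "tpsa": 32.7}
too_large = {"mol_weight": 342.0, "clogp": 2.9, "h_acceptors": 3,
             "h_donors": 2, "rotatable_bonds": 2, "tpsa": 75.0}

is_fragment = not ro3_violations(fragment_like)
problems = ro3_violations(too_large)
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easier to see why a candidate was rejected; an empty list means the fragment satisfies every limit.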
The prediction of absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox) properties of drug candidates has become an indispensable step in computer-aided drug discovery and development (CADDD). It is estimated that 30% of drug attrition is caused by failures triggered mainly by unwanted ADME-Tox properties of the drug itself [57]. However, in vitro and in vivo investigations to determine these properties are neither inexpensive nor time-efficient. They contribute heavily to the high cost of developing new drugs, which currently requires approximately 2.6 billion USD and 14-20 years from the initial laboratory phase until the drug is widely marketed [20,58]. In recent years, computational ADME-Tox predictions, which offer a valuable, safe, cheap, and rapid way to estimate molecular properties based on, for instance, structural alerts, have become an invaluable tool in CADDD and are routinely performed before drug candidates are synthesized in a wet laboratory [59,60]. In this study, several software tools were used to identify the ADME-Tox properties of the ligands selected from the docking simulation results, namely DataWarrior [28,29], Toxtree v2.6.6 [32], admetSAR [33], and SwissADME [34].
The molecular properties of the ligands may determine their ability to be easily absorbed into the human body via oral administration. Hence, Lipinski's Rule of Five (RO5) was popularized and has arguably been the most influential, yet simple, concept in CADDD and medicinal chemistry in the last few decades [56,61]. In this study, five molecular properties were considered: logP, molecular weight (MW), hydrogen bond acceptor count, hydrogen bond donor count, and topological polar surface area (TPSA). Overall, Lipinski's RO5 states that a compound has a high probability of poor permeability and absorption through oral administration when it has a molecular weight higher than 500 Da, a logP higher than 5.0, more than 10 hydrogen bond acceptors, or more than 5 hydrogen bond donors [45,56]. Additionally, a TPSA value higher than 140 Å² is also associated with low oral absorption of the drug molecule [46].
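The thresholds above can be checked programmatically once the descriptors are available. The sketch below is illustrative only and does not reproduce the software used in this study: the function name `oral_absorption_flags` and the example values are hypothetical, and the descriptors are assumed to be computed beforehand. It reports which criteria are exceeded rather than a single pass/fail verdict, so that the number of tolerated violations is left to the user.

```python
from typing import List


def oral_absorption_flags(
    mol_weight: float,
    logp: float,
    h_bond_acceptors: int,
    h_bond_donors: int,
    tpsa: float,
) -> List[str]:
    """List the oral-absorption criteria that a compound exceeds.

    The first four checks follow Lipinski's Rule of Five as stated in the
    text; the last check is the additional TPSA criterion (140 square
    angstroms). An empty list means no threshold is exceeded.
    """
    flags: List[str] = []
    if mol_weight > 500:
        flags.append("MW > 500 Da")
    if logp > 5.0:
        flags.append("logP > 5")
    if h_bond_acceptors > 10:
        flags.append("HBA > 10")
    if h_bond_donors > 5:
        flags.append("HBD > 5")
    if tpsa > 140:
        flags.append("TPSA > 140 A^2")
    return flags


# Usage with made-up descriptor values (not taken from the study).
print(oral_absorption_flags(384.4, 2.3, 7, 3, 118.0))
print(oral_absorption_flags(612.7, 5.8, 12, 6, 190.5))
```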
The potential of a compound to act as either a substrate or an inhibitor of P-glycoprotein (P-gp) and cytochrome P450 (CYP450) enzymes also determines its effectiveness and efficiency as a drug in the human body. P-gp is a drug transporter that plays an important role in protecting the body from toxic substances by limiting their absorption when administered orally; this protein also plays a significant part in drug-drug interactions [62]. Furthermore, a compound that is a P-gp substrate may affect the function of P-gp, acting either as an inducer or as an inhibitor, which can decrease or increase the compound's bioavailability in the human body, respectively [41]. The CYP450 enzymes, on the other hand, are essential metabolizing enzymes that are mainly involved in various oxidation reactions of xenobiotic compounds [63]. Of the five common CYPs involved in these reactions, CYP450 3A4 is the most important and is responsible for metabolizing more than half of the marketed drugs in the world [64]. In this study, these properties were predicted using the admetSAR web service [33].

Conclusions
Our results showed that C-7756, C-5769, C-1723, C-2129, and C-2140 have a higher affinity to DNMT1 than the standards (SAH, SAM, and SFG), as indicated by their lower ∆Gbinding values. Moreover, the selected ligands have pharmacological advantages over the standards in terms of druglikeness, GI absorption, and oral bioavailability. Having the lowest ∆Gbinding and the fewest unwanted ADME-Tox properties, C-7756 has the potential to be a drug lead for inhibiting DNMT1 in breast cancer therapy. Finally, these results must be examined further through molecular dynamics simulations as well as in vitro and in vivo studies to investigate the potential of the ligands under biological conditions.