Design of Novel IRAK4 Inhibitors Using Molecular Docking, Dynamics Simulation and 3D-QSAR Studies

Treatment of several autoimmune diseases and types of cancer has been an intense area of research over the past two decades. Many signaling pathways that regulate innate and/or adaptive immunity, as well as those that induce overexpression or mutation of protein kinases, have been targeted for drug discovery. One of the serine/threonine kinases, Interleukin-1 Receptor Associated Kinase 4 (IRAK4) regulates signaling through various Toll-like receptors (TLRs) and interleukin-1 receptor (IL1R). It controls diverse cellular processes including inflammation, apoptosis, and cellular differentiation. MyD88 gain-of-function mutations or overexpression of IRAK4 has been implicated in various types of malignancies such as Waldenström macroglobulinemia, B cell lymphoma, colorectal cancer, pancreatic ductal adenocarcinoma, breast cancer, etc. Moreover, over activation of IRAK4 is also associated with several autoimmune diseases. The significant role of IRAK4 makes it an interesting target for the discovery and development of potent small molecule inhibitors. A few potent IRAK4 inhibitors such as PF-06650833, RA9 and BAY1834845 have recently entered phase I/II clinical trial studies. Nevertheless, there is still a need of selective inhibitors for the treatment of cancer and various autoimmune diseases. A great need for the same intrigued us to perform molecular modeling studies on 4,6-diaminonicotinamide derivatives as IRAK4 inhibitors. We performed molecular docking and dynamics simulation of 50 ns for one of the most active compounds of the dataset. We also carried out MM-PBSA binding free energy calculation to identify the active site residues, interactions of which are contributing to the total binding energy. The final 50 ns conformation of the most active compound was selected to perform dataset alignment in a 3D-QSAR study. Generated RF-CoMFA (q2 = 0.751, ONC = 4, r2 = 0.911) model revealed reasonable statistical results. Overall results of molecular dynamics simulation, MM-PBSA binding free energy calculation and RF-CoMFA model revealed important active site residues of IRAK4 and necessary structural properties of ligand to design more potent IRAK4 inhibitors. We designed few IRAK4 inhibitors based on these results, which possessed higher activity (predicted pIC50) than the most active compounds of the dataset selected for this study. Moreover, ADMET properties of these inhibitors revealed promising results and need to be validated using experimental studies.


Introduction
One of the significant serine/threonine kinases, IRAK4 (Interleukin-1 receptor associated kinase 4), is essential for the scaffolding and phosphorylation of the toll-like receptor (TLR) [1,2]. IRAK1, IRAK2, IRAK-M, and IRAK4 are all homologs of the IRAK protein and they have an N-terminal death domain, a linker, and a kinase domain; however, only IRAK4 lacks a C-terminal extension [3,4]. IRAK4 kinase domain contains typical two lobe conformation observed in several protein kinases with the ATP binding pocket. The Nterminal lobe of IRAK4 has two distinctive features: An N-terminal extension of unknown function and a tyrosine as the gatekeeper residue [3,4]. Usually, in the ATP-binding pocket, the key gatekeeper is responsible for providing access to the pre-existing hydrophobic pocket shaped by the residues of α helix C, DFG motif, and the gatekeeper residue. The existence of a tyrosine gatekeeper, which is unique to the IRAK family, has significance for the development of selective IRAK4 inhibitors [3,5].
IRAKs are important mediators of interleukin-1 receptor (IL1R) and TLR signalling processes including IL-33, IL-18, and IL-1 receptors [2]. TLR/IL1R-mediated signalling regulates multiple cellular processes such as apoptosis, inflammation, and cellular differentiation. TLR/IL1R signalling is associated with the employment of adaptor molecules such as Mal/TIRAP, MyD88, TRAM and TRIF that play crucial role in the operating and activation of IRAK family [3,6]. When IRAK4 is recruited to MyD88, activation of IRAK4 causes IRAK1 and/or IRAK2 to be phosphorylated [7]. The MyD88:IRAK4:IRAK2 het-eromeric structure is known the myddosome complex. When the myddosome complex is formed, IRAK1 phosphorylation initiates the accomplishment of scaffolding with TRAF6 that further induces downstream signaling such as NFκB-mediated transcription activation [8].
In combination with B cell and T cell hyperactivity, abnormal IRAK4 activation stimulates the inflammatory chemokine and cytokine pathways, which increases autoimmune signaling in the associated disease [9]. Other study disclosed that human patients missing IRAK4 are seriously hyporesponsive and immunocompromised to IL-1 and LPS [10]. Therefore, IRAK4 has a vital role in innate immunity and its inhibition by small molecules would be significant for the treatment of various types of inflammations and autoimmune disease such as inflammatory bowel disease (IBD), rheumatoid arthritis (RA) and Systematic Lupus Erythematosus (SLE) [8,11].
Moreover, IRAK4 has a vital role in the growth of several malignancies [11]. Patients with melanoma have been found to have higher basal levels of IRAK4 phosphorylation [12]. MyD88 gain-of-function mutations (L265P somatic mutation) lead to numerous rare and difficult to treat hematological malignancies such as activated B cell diffuse large B-cell lymphoma (ABC-DLBCL), Waldenström macroglobulinemia (WM), chronic lymphocytic leukemia (CLL) and pancreatic ductal adenocarcinoma (PDAC) [13,14]. Aberrant activation of the NF-κB is also responsible for various cancer types, including colorectal cancer [15]. The evident role of IRAK4 inhibitors as probable anticancer and anti-inflammatory agents impelled us to perform several molecular modelling studies including 3D-QSAR toward the discovery of novel IRAK4 inhibitors. Although there have been previous reports aimed at the generation of IRAK4 inhibitors, the development of cancer drugs is currently of renewed interest.
Over the past few years, there have been dozens of patent applications as well as the peer reviewed literature filed/published appealing small molecule inhibitors of IRAK4 [16,17]. It seems that autoimmune and cancer diseases are being targeted. Few pharma companies have proposed their inhibitors, among which Pfizer is the most advanced, having completed a phase II clinical study with their candidate compound, PF-06650833 (IC 50 = 0.2 nM) for rheumatoid arthritis (RA) and BAY 1834845 has entered phase 1 clinical trials [17,18]. Additionally, Aurigene/Curis and BMS companies proposed their compounds, CA-4948 (IC 50 < 50 nM) and BMS-986126 (IC 50 = 5.3 nM) respectively [16,17]. Each candidate small molecule has demonstrated reasonable efficacy in several murine autoimmune disease models and are currently in Phase I or II clinical trials [1,4]. However, the design of potent and selective IRAK4 inhibitors for the treatment of cancer is still required.
Hence, in the current study, we have implemented various molecular modelling studies on the diaminonicotinamide derivatives as IRAK4 anticancer inhibitors. Initially we executed molecular docking and 50 ns dynamics simulation (MD) of the most active compound 33 from selected dataset for this study [2], following MM-PBSA binding free energy calculations. These studies helped us to understand the binding mode of IRAK4 inhibitor and key active site residues of IRAK4, which contribute most to the total binding energy. We then performed (3D-QSAR) 3D-quantitative structure-activity relationship by using the 50 ns pose from MD to align dataset compounds into the binding site of receptor. Altogether, analyses of results revealed crucial structural characteristics of ligand to improve the potency. We designed few IRAK4 inhibitors that possess better-predicted activity (pIC 50 ) than the most active compound of the dataset used in this study. Our design scheme and predicted ADMET values could be useful for medicinal chemists or pharmaceutical companies to develop novel IRAK4 inhibitors. Further experimental studies need to be performed to validate our designed IRAK4 inhibitors.

Molecular Docking
The co-crystalized ligand was re-docked into the IRAK4 active site to evaluate the docking technique. The co-crystallized ligand and the re-docked ligand had similar binding conformations and H-bond interactions. Their difference in RMSD (root mean square deviation) was 1.20 Å that proved the docking technique was reliable. Then we docked the most active compound 33 from selected dataset into the active site of IRAK4 ( Figure 1A). Based on the lowest binding energy and binding interactions with the active site residues, compound 33 s docked pose was chosen. Binding energy of the compound 33 with IRAK4 was found to be -7.94 kcal/mol which showed six hydrogen bonds. Three hydrogen bonds were formed with the key hinge region residues. The NH from methylnicotinamide and pyridyl nitrogen formed two hydrogen bonds with residue Met265. Third hinge region hydrogen bond was observed between residue Val263 and the NH of amide. These three hydrogen bonds are known as 'classical triad hinge binding interaction', which are important interactions and were seen in docking of co-crystallized ligand as well as other IRAK4 inhibitors [2]. The two OH groups from phenylbutane-1,2-diol formed 3 hydrogen bonds with residues Ser269 and Asp272 respectively. Furthermore, indoline ring was docked into the hydrophobic pocket and it formed pi-pi interaction with the gatekeeper residue Tyr262. The interaction with Tyr262 was reported to be unique and essential for the inhibition of IRAK4 [2]. compound 33 from selected dataset for this study [2], following MM-PBSA binding free energy calculations. These studies helped us to understand the binding mode of IRAK4 inhibitor and key active site residues of IRAK4, which contribute most to the total binding energy. We then performed (3D-QSAR) 3D-quantitative structure-activity relationship by using the 50 ns pose from MD to align dataset compounds into the binding site of receptor. Altogether, analyses of results revealed crucial structural characteristics of ligand to improve the potency. We designed few IRAK4 inhibitors that possess better-predicted activity (pIC50) than the most active compound of the dataset used in this study. Our design scheme and predicted ADMET values could be useful for medicinal chemists or pharmaceutical companies to develop novel IRAK4 inhibitors. Further experimental studies need to be performed to validate our designed IRAK4 inhibitors.

Molecular Docking
The co-crystalized ligand was re-docked into the IRAK4 active site to evaluate the docking technique. The co-crystallized ligand and the re-docked ligand had similar binding conformations and H-bond interactions. Their difference in RMSD (root mean square deviation) was 1.20 Å that proved the docking technique was reliable. Then we docked the most active compound 33 from selected dataset into the active site of IRAK4 ( Figure 1A). Based on the lowest binding energy and binding interactions with the active site residues, compound 33′s docked pose was chosen. Binding energy of the compound 33 with IRAK4 was found to be -7.94 kcal/mol which showed six hydrogen bonds. Three hydrogen bonds were formed with the key hinge region residues. The NH from methylnicotinamide and pyridyl nitrogen formed two hydrogen bonds with residue Met265. Third hinge region hydrogen bond was observed between residue Val263 and the NH of amide. These three hydrogen bonds are known as 'classical triad hinge binding interaction', which are important interactions and were seen in docking of co-crystallized ligand as well as other IRAK4 inhibitors [2]. The two OH groups from phenylbutane-1,2diol formed 3 hydrogen bonds with residues Ser269 and Asp272 respectively. Furthermore, indoline ring was docked into the hydrophobic pocket and it formed pi-pi interaction with the gatekeeper residue Tyr262. The interaction with Tyr262 was reported to be unique and essential for the inhibition of IRAK4 [2]. The docked pose of compound 33 was further analyzed to assess hydrophobic interactions. A Python script 'colour h' was utilized to colour the hydrophobic residues of IRAK4 and check their interactions with compound 33. This script uses an Eisenberg hydrophobicity scale ( Figure 1B) to colour the receptor in PyMOL [19]. The most The docked pose of compound 33 was further analyzed to assess hydrophobic interactions. A Python script 'colour h' was utilized to colour the hydrophobic residues of IRAK4 and check their interactions with compound 33. This script uses an Eisenberg hydrophobicity scale ( Figure 1B) to colour the receptor in PyMOL [19]. The most hydrophobic residues were colored red, and the least hydrophobic area was colored white. The amide group and indoline ring are docked deep inside the hydrophobic pocket occupying residues Val200, Ala211, Leu318 and Tyr262. As discussed earlier, Tyr262 is one of the im-portant residue in the active site of IRAK4 and unique to the IRAK family as a gatekeeper residue, interaction of which makes it crucial for IRAK4 inhibition. It was also reported that the interaction with Tyr262 plays an efficient role in selectivity over other kinases such as JAK3 [2]. Therefore, based on observed important interactions of compound 33 with IRAK4 and its low binding energy, this pose was selected for further validation using molecular dynamics simulation.

Molecular Dynamics Simulation
To examine the binding stability and conformation of the ligand, Gromacs-2018 [20] was used to conduct MD simulation of the docked complex of compound 33 and IRAK4. A production run of 50 ns MD simulation was performed. Figure 2A shows the root mean square deviation (RMSD) for a ligand and protein. The protein RMSD and ligand RMSD are shown in red and black colored lines in the graph, respectively. The plot indicates that the protein's RMSD approached stability at 20 ns, varied somewhat between 25 and 40 ns, and then stabilized at the end of the simulation, indicating that the protein's stable conformation was attained. Prior to the 50 ns simulation, no variations with less than 0.1 nm deviations were seen in compound 33 s RMSD, which stabilised at 18 ns. During the simulation, there were hardly any fluctuations seen, with the exception of the loop sections, and the inclusive fluctuation was less than 2Å. At the end of the simulation, the system was in equilibrium, according to overall RMSD analyses.
occupying residues Val200, Ala211, Leu318 and Tyr262. As discussed earlier, Tyr262 is one of the im-portant residue in the active site of IRAK4 and unique to the IRAK family as a gatekeeper residue, interaction of which makes it crucial for IRAK4 inhibition. It was also reported that the interaction with Tyr262 plays an efficient role in selectivity over other kinases such as JAK3 [2]. Therefore, based on observed important interactions of compound 33 with IRAK4 and its low binding energy, this pose was selected for further validation using molecular dynamics simulation.

Molecular Dynamics Simulation
To examine the binding stability and conformation of the ligand, Gromacs-2018 [20] was used to conduct MD simulation of the docked complex of compound 33 and IRAK4. A production run of 50 ns MD simulation was performed. Figure 2A shows the root mean square deviation (RMSD) for a ligand and protein. The protein RMSD and ligand RMSD are shown in red and black colored lines in the graph, respectively. The plot indicates that the protein's RMSD approached stability at 20 ns, varied somewhat between 25 and 40 ns, and then stabilized at the end of the simulation, indicating that the protein's stable conformation was attained. Prior to the 50 ns simulation, no variations with less than 0.1 nm deviations were seen in compound 33′s RMSD, which stabilised at 18 ns. During the simulation, there were hardly any fluctuations seen, with the exception of the loop sections, and the inclusive fluctuation was less than 2Å. At the end of the simulation, the system was in equilibrium, according to overall RMSD analyses. Hydrogen bond analysis of compound 33 at 50 ns showed that it formed 3 H-bonds with the crucial residues of IRAK4 from hinge region ( Figure 3A). These 3 hydrogen bonds with residues Val263 and Met265 (classical triad hinge binding interaction) are same as that of hydrogen bonds observed in the docking analysis above. These three interactions were consistent throughout the 50 ns of simulation ( Figure 2B). Therefore, 50 ns pose of compound 33 retained three crucial hydrogen bonds with IRAK4. The pi-pi interaction of indoline with Tyr262 was also observed. But other 3 hydrogen bonds with residues Ser269 and Asp272 were lost after 50 ns MD simulation due to the change in the conformation of phenylbutane-1,2-diol moiety. This conformational change was expected due to a special tyrosine gatekeeper residue, IRAK4 lacks a rear pocket in the ATP binding site, and its solvent-exposed region is larger than that of other kinases [2]. It was also reported previously that efforts to strengthen the H-bond to the carboxylic acid of Asp272 were less successful [2]. However, pi-pi interaction and important hydrophobic interactions with Hydrogen bond analysis of compound 33 at 50 ns showed that it formed 3 H-bonds with the crucial residues of IRAK4 from hinge region ( Figure 3A). These 3 hydrogen bonds with residues Val263 and Met265 (classical triad hinge binding interaction) are same as that of hydrogen bonds observed in the docking analysis above. These three interactions were consistent throughout the 50 ns of simulation ( Figure 2B). Therefore, 50 ns pose of compound 33 retained three crucial hydrogen bonds with IRAK4. The pi-pi interaction of indoline with Tyr262 was also observed. But other 3 hydrogen bonds with residues Ser269 and Asp272 were lost after 50 ns MD simulation due to the change in the conformation of phenylbutane-1,2-diol moiety. This conformational change was expected due to a special tyrosine gatekeeper residue, IRAK4 lacks a rear pocket in the ATP binding site, and its solvent-exposed region is larger than that of other kinases [2]. It was also reported previously that efforts to strengthen the H-bond to the carboxylic acid of Asp272 were less successful [2]. However, pi-pi interaction and important hydrophobic interactions with residues Tyr262, Val200, Ala211 and Leu318 were reproduced at the end of 50 ns MD simulation ( Figure 3B). Hence, we considered 50 ns pose as a reasonable binding pose of compound 33 and used it as a template structure for the alignment of dataset compounds in 3D-QSAR studies. residues Tyr262, Val200, Ala211 and Leu318 were reproduced at the end of 50 ns MD simulation ( Figure 3B). Hence, we considered 50 ns pose as a reasonable binding pose of compound 33 and used it as a template structure for the alignment of dataset compounds in 3D-QSAR studies.

MM/PBSA Binding Free Energy Calculation
The MM/PBSA package [21] was utilized to compute the binding affinity of compound 33. The predicted binding free energy was -112.841 kJ/mol. It is the sum of Van der Waal energy of -220.417 kJ/mol, electrostatic energy of -45.771 kJ/mol, polar salvation energy of 173.117 kJ/mol and SASA energy of -19.859 kJ/mol. Van der Waals energy and electrostatic energy were important for compound 33′s binding with IRAK4. However, the binding of component 33 did not benefit from the polar salvation energy. In our docking and MD analyses, most of the interactions formed by compound 33 were hydrogen bonds and hydrophobic that were found to be consistent. This explains why Van der Waals energy contributed the most among them. Additionally, for a thorough understanding of the ligand-protein interactions, we carried out a binding free energy decomposition analysis. The energy decomposition of each residue is depicted in column chart ( Figure 4). The main contribution to the binding of compound 33 was from residues Tyr262, Leu318, Tyr264, Met265, Ala211 and Val263, which were involved in the hydrogen bond and hydrophobic interactions. On the contrary, residues Glu238 and Lys213 were in disfavor with the binding of compound 33. In conclusion, the study of the binding free energy demonstrated the role of essential active site residues in IRAK4 inhibition.

MM/PBSA Binding Free Energy Calculation
The MM/PBSA package [21] was utilized to compute the binding affinity of compound 33. The predicted binding free energy was -112.841 kJ/mol. It is the sum of Van der Waal energy of -220.417 kJ/mol, electrostatic energy of -45.771 kJ/mol, polar salvation energy of 173.117 kJ/mol and SASA energy of -19.859 kJ/mol. Van der Waals energy and electrostatic energy were important for compound 33 s binding with IRAK4. However, the binding of component 33 did not benefit from the polar salvation energy. In our docking and MD analyses, most of the interactions formed by compound 33 were hydrogen bonds and hydrophobic that were found to be consistent. This explains why Van der Waals energy contributed the most among them. Additionally, for a thorough understanding of the ligand-protein interactions, we carried out a binding free energy decomposition analysis. The energy decomposition of each residue is depicted in column chart ( Figure 4). The main contribution to the binding of compound 33 was from residues Tyr262, Leu318, Tyr264, Met265, Ala211 and Val263, which were involved in the hydrogen bond and hydrophobic interactions. On the contrary, residues Glu238 and Lys213 were in disfavor with the binding of compound 33. In conclusion, the study of the binding free energy demonstrated the role of essential active site residues in IRAK4 inhibition.

3D-QSAR (CoMFA and RF-CoMFA)
Receptor-based comparative molecular field analysis (CoMFA) and region focused CoMFA (RF-CoMFA) [22] models were developed for the diaminonicotinamide derivatives. All dataset compounds were sketched and aligned inside the active site of IRAK4 using the MD conformation of the most active compound 33 as a template in SYBYL-X 2.1. The alignment of the dataset compounds is shown in Figure 5. The dataset compounds were separated into training set (26) and test set (12) using the standards given by Golbraikh et al. and algorithm 4 (activity ranking) in the reported article [23]. We chose algorithm

3D-QSAR (CoMFA and RF-CoMFA)
Receptor-based comparative molecular field analysis (CoMFA) and region CoMFA (RF-CoMFA) [22] models were developed for the diaminonico derivatives. All dataset compounds were sketched and aligned inside the activ IRAK4 using the MD conformation of the most active compound 33 as a tem SYBYL-X 2.1. The alignment of the dataset compounds is shown in Figure 5. Th compounds were separated into training set (26) and test set (12) using the s given by Golbraikh et al. and algorithm 4 (activity ranking) in the reported article chose algorithm 4 (activity ranking) because there are no large gaps in activity v dataset compounds and algorithm 4 can construct a test set that represents th range of activities. Thus, our test set contains compounds having high, medium, activity (pIC50) values. For assessing the reliability of a 3D-QSAR model, it is essential to calculat statistical parameters using the partial least square (PLS) method, such as cross-v correlation coefficient (q 2 ), non-cross-validated correlation coefficient (r 2 ), standa

3D-QSAR (CoMFA and RF-CoMFA)
Receptor-based comparative molecular field analysis (CoMFA) and region focus CoMFA (RF-CoMFA) [22] models were developed for the diaminonicotinami derivatives. All dataset compounds were sketched and aligned inside the active site IRAK4 using the MD conformation of the most active compound 33 as a template SYBYL-X 2.1. The alignment of the dataset compounds is shown in Figure 5. The datas compounds were separated into training set (26) and test set (12) using the standar given by Golbraikh et al. and algorithm 4 (activity ranking) in the reported article [23]. W chose algorithm 4 (activity ranking) because there are no large gaps in activity values dataset compounds and algorithm 4 can construct a test set that represents the who range of activities. Thus, our test set contains compounds having high, medium, and lo activity (pIC50) values. For assessing the reliability of a 3D-QSAR model, it is essential to calculate sever statistical parameters using the partial least square (PLS) method, such as cross-validat correlation coefficient (q 2 ), non-cross-validated correlation coefficient (r 2 ), standard err of estimate (SEE), optimal number of components (ONC), and F value. We develop CoMFA models (q 2 = 0.502, ONC = 4, r 2 = 0.823) for the full dataset. These statistical valu were in an acceptable range but quite low to consider them as a good predictive mod Therefore, we derived region focused CoMFA by using the PLS analysis obtained in t CoMFA model (RF-CoMFA: q 2 = 0.527, ONC = 6, r 2 = 0.905). The obtained RF-CoMF For assessing the reliability of a 3D-QSAR model, it is essential to calculate several statistical parameters using the partial least square (PLS) method, such as cross-validated correlation coefficient (q 2 ), non-cross-validated correlation coefficient (r 2 ), standard error of estimate (SEE), optimal number of components (ONC), and F value. We developed CoMFA models (q 2 = 0.502, ONC = 4, r 2 = 0.823) for the full dataset. These statistical values were in an acceptable range but quite low to consider them as a good predictive model. Therefore, we derived region focused CoMFA by using the PLS analysis obtained in the CoMFA model (RF-CoMFA: q 2 = 0.527, ONC = 6, r 2 = 0.905). The obtained RF-CoMFA model possessed better statistical results hence it was selected for further validation. RF-CoMFA model derived using external test set validation (q 2 = 0.751, ONC = 4, r 2 = 0.911) showed highest q 2 and r 2 values. The latter model was selected as a final model due to its better q 2 and r 2 values. The detailed statistical values of the selected RF-CoMFA model are given in Table 1.

Validation of RF-CoMFA Model
A number of validation techniques were used to assess robustness and predictive ability of produced CoMFA model. All the techniques, such as bootstrapping, predictive r 2 (external test set), progressive scrambling (Q 2 ), and rm 2 metric calculation, exhibited statistical values that were within the adequate range [24,25]. These findings showed that the chosen model was reliable and predictive. Detailed statistical values are shown in Table 1. In Table S2 of the Supplementary Material, the experimental and predicted activity values for this model are presented. The scatter plot for the same is shown in Figure 6. The compound 33 is shown superimposed with RF-CoMFA contour maps into the active site of IRAK4. A number of validation techniques were used to assess robustness and predic ability of produced CoMFA model. All the techniques, such as bootstrapping, predic r 2 (external test set), progressive scrambling (Q 2 ), and rm 2 metric calculation, exhib statistical values that were within the adequate range [24,25]. These findings showed the chosen model was reliable and predictive. Detailed statistical values are show Table 1. In Table S2    substitutions, while yellow and red colors indicate regions that are unfavorable for these types of replacements. Figure 6. Scatter plot for the selected RF-CoMFA model; the plot shows the actual pIC50 versus predicted pIC50 activity of the training and test sets; the training set compounds are represented as blue diamonds; the test set compounds are represented as red squares.  In steric contour map, a big green-colored contour ( Figure 7A) was observed at R 2 position of the propan-1,2-diol moiety, indicating that in this area, bulky groups are preferred to elevate the activity. Substituting a steric group at R 2 position could interact with many hydrophobic pocket residues of IRAK4. The hydrophobic interactions with residues Leu318 and Gly268 seen in our docking and MD simulation study of compound 33 can help to explain this. Similarly, two small yellow colored contours were present near both phenyl ring of the R 2 substitution and methylacetamide at R 1 position, which suggests that this region is unfavorable for the bulky groups. Adding bulky group at these positions may hinder with the hinge region residues. In the electrostatic contour map ( Figure 7B), a small blue colored contour was located near the NH of indoline ring at the R 3 position, which shows that the electropositive group in this position is favorable and may interact via H-bonds with neighboring active site residues. Conversely, a small red colored contour was seen near phenyl ring at R 2 substitution; signifying that electronegative groups at this place are favorable. Thus, overall contour map analysis, docking and MD simulation analyses revealed important structural features of a ligand to improve their potency.

Designing of IRAK4 Inhibitors and Their ADMET Calculation
A reliable RF-CoMFA model development and its contour map analysis was valuable to propose a design strategy to design more potent IRAK4 compounds ( Table 2). The structural characteristics studied and analyzed from contour maps were used to design new IRAK4 inhibitors. Using this strategy, we designed a small number of inhibitors, whose activities were then predicted using the chosen RF-CoMFA model. All designed compounds exhibited predicted activity (pIC50) more than the activity of the most potent compound 33 of the dataset. In Table 2, the proposed compounds' structures and predicted pIC50 values are displayed.  In steric contour map, a big green-colored contour ( Figure 7A) was observed at R 2 position of the propan-1,2-diol moiety, indicating that in this area, bulky groups are preferred to elevate the activity. Substituting a steric group at R 2 position could interact with many hydrophobic pocket residues of IRAK4. The hydrophobic interactions with residues Leu318 and Gly268 seen in our docking and MD simulation study of compound 33 can help to explain this. Similarly, two small yellow colored contours were present near both phenyl ring of the R 2 substitution and methylacetamide at R 1 position, which suggests that this region is unfavorable for the bulky groups. Adding bulky group at these positions may hinder with the hinge region residues. In the electrostatic contour map ( Figure 7B), a small blue colored contour was located near the NH of indoline ring at the R 3 position, which shows that the electropositive group in this position is favorable and may interact via H-bonds with neighboring active site residues. Conversely, a small red colored contour was seen near phenyl ring at R 2 substitution; signifying that electronegative groups at this place are favorable. Thus, overall contour map analysis, docking and MD simulation analyses revealed important structural features of a ligand to improve their potency.

Designing of IRAK4 Inhibitors and Their ADMET Calculation
A reliable RF-CoMFA model development and its contour map analysis was valuable to propose a design strategy to design more potent IRAK4 compounds ( Table 2). The structural characteristics studied and analyzed from contour maps were used to design new IRAK4 inhibitors. Using this strategy, we designed a small number of inhibitors, whose activities were then predicted using the chosen RF-CoMFA model. All designed compounds exhibited predicted activity (pIC 50 ) more than the activity of the most potent compound 33 of the dataset. In Table 2, the proposed compounds' structures and predicted pIC 50 values are displayed.
Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemica properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online too (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. 8 Additionally, we predicted the ADMET characteristics for each designed compound Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemica properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online too (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemica properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online too (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemica properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online too (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. 8 Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. 8 Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors. Additionally, we predicted the ADMET characteristics for each designed compound. Table 3 displays their properties in detail. We have predicted in silico ADMET (absorption, distribution, metabolism, excretion and toxicity), physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness using Artificial Intelligence based In-silico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/, accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds. Hence, prediction results showed in Table 3 shows that designed inhibitors have promising ADMET properties. These inhibitors could be further evaluated using experimental studies and our design strategy can be used to develop potent and selective IRAK4 inhibitors.

Test Set/Training Set Selection for 3D-QSAR Analyses
A dataset of 38 IRAK4 inhibitors, with the diaminonicotinamide as a common scaffold, was selected for our study [2]. SYBYL-X 2.1 was used to draw and optimize the structures utilizing energy minimization with Tripos force field [26]. To create 3D-QSAR models, biological activities (IC 50 ) were translated into pIC 50 (-log IC 50 ) values and used as dependent variables. The activity log span of pIC 50 values of dataset compounds was more than 3 logarithmic units, that is within the necessity range [24,26]. The dataset was divided into a training set of 26 compounds for model generation and 12 compounds as test set for model validation based on the activity span of compounds. The chemical structures of the dataset compounds with their IC 50 values are listed in Table S1 of supplementary material, in which the test set compounds are denoted by *.

Modeling of the Missing Residues
The crystal structure of IRAK4 with high resolution (PDB ID: 5W85) was obtained from the protein data bank for our study. Missing residues are present in the loop region from residue Ala216 to Thr223 and Glu337 to Gln341, which were modeled and refined using the modellerV9.14 [27]. Taking into consideration the energy, GA341 score [28], and DOPE score [29], the final modeled structure was selected.

Preparation of the Protein and Molecular Docking
We used Autodock 4 to perform molecular docking of the most potent compound 33 of the series [30]. The co-crystal structure (PDB: 5W85) was utilized as a reference to dock the compound 33 inside the active site of the IRAK4 kinase. Before performing docking, the receptor structure was prepared by the addition of polar hydrogens, applying Kollman charges and assigning AD4 atom types. Subsequently, Autodock tools were used to prepare the ligand by keeping the number of rotatable bonds less than 6. The active site grid was produced using the x, y, and z coordinates of the co-crystallized ligand. The grid box was extended to 70 × 70 × 70 points, with a grid spacing of 0.375 Å. The docking was executed using the Lamarckian genetic algorithm (LGA) by setting the number of the genetic algorithm (GA) run to 100 [24]. The docked pose of compound 33 was selected based on its interactions with IRAK4 kinase and the lowest binding energy.

Molecular Dynamics Simulations
The MD simulation was performed using Gromacs-2018 [20]. The protein and ligand topology files were generated using Amber99SB force field [31] and general AMBER force field (GAFF) [32], respectively. The ligand force field parameters were developed using the ACPYPE program [33]. The system was neutralized by adding 13 sodium ions. A threepoint water model (TIP3P) was used as the solvent. Energy minimization was performed by using the steepest descent method for 50,000 steps. Subsequently, the system was equilibrated first via a NVT ensemble for a 100 ps at 300 K using Berendsen thermostat [34] and then using NPT for 100 ps with the constant pressure of 1 atm. The bonds were constrained using the LINCS algorithm [35]. The particle mesh Ewald (PME) method [36] was utilized to handle the long-range coulombic interactions. A 50 ns production run was performed using NPT ensemble at 300 K with 1.0 atm pressure with a time step of 2 fs.

MM/PBSA Binding Free Energy Calculations
The g_mmpbsa package was used to perform molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) free energy calculation [21]. The last 1 ns from the production run of 50 ns MD simulation was utilized for the calculation of binding free energy. The binding free energy contains three energetic terms, including potential energy in vacuum, polar-solvation energy, and nonpolar solvation energy. The molecular mechanics force field parameters were used to calculate both bonded (angle, bond, and dihedral) and non-bonded (electrostatic and van der Waal) interactions included in the potential energy in vacuum. Similarly, the Poisson-Boltzmann equation and solvent accessible surface area (SASA) model was used to calculate polar solvation energy and nonpolar solvation energy, respectively [37]. The assessment of binding free energy for the protein-ligand complex in a solvent was calculated based on the equation given below: where, ∆G binding is the binding free energy and ∆G complex , ∆G protein , and ∆G ligand represent the free energy of complex, protein, and ligand, respectively.

CoMFA and RF-CoMFA
3D-QSAR models were developed using Comparative Molecular Field Analysis (CoMFA) and Region-Focused CoMFA (RF-CoMFA) to correlate the biological activity with the 3D structure of the compounds using SYBYL-X 2.1. CoMFA utilizes the steric and electrostatic potential energies which are calculated using Lennard-Jones and Coulombic potentials respectively. The dataset compounds were aligned using a template molecule (50 ns MD pose of the most active compound 33) [24]. The selection of an appropriate partial charge scheme is important to develop reasonable 3D-QSAR models [24]. We have selected Pullman as partial charge scheme to generate CoMFA models. Default parameters were utilized to develop CoMFA and RF-CoMFA models. An sp 3 hybridized carbon as probe atom with +1 charge and a grid spacing of 2.0 Å was used. Statistically reasonable CoMFA and RF-CoMFA models were developed using the Partial Least Squares (PLS) regression. In the PLS analysis, CoMFA descriptors and biological activity values (pIC 50 ) were used as independent variables and dependent variables respectively. PLS analysis with Leave-one-out (LOO) crossvalidation was executed to evaluate the reliability of the generated models. PLS analysis was used to calculate the squared cross-validated correlation coefficient (q 2 ) value, an optimal number of components (ONC) and the standard deviation of prediction (SEP). A column filtering value of 2.0 was used. Based on the obtained ONC, non-crossvalidation analysis was then performed to calculate the squared correlation coefficient (r 2 ), F-test value (F) and standard error of estimate (SEE).
Similarly, RF-CoMFA model was generated using the PLS analysis obtained during CoMFA model development. RF-CoMFA is an iterative process that refines a built model by improving the weight for those lattice points which are most related to the model. This enhances the predictive capability of the PLS analysis used in RF-CoMFA.

Model Validation
The selected RF-CoMFA model was checked for predictive ability using different validation techniques such as bootstrapping, leave-five-out (LOF), an external test set validation, and rm 2 metric calculations [25]. Bootstrapping for 100 runs was performed to validate the model's predictability [38]. The models were also validated by the predictive correlation coefficient (r 2 pred ).

Design of New IRAK4 Inhibitors and ADMET Calculation
We have derived a design strategy based on the structural information obtained from the contour map analysis of selected RF-CoMFA models. We designed new eight compounds and further calculated their in-silico ADMET (absorption, distribution, metabolism, excretion and toxicity), pharmacokinetic properties using Artificial Intelligence based Insilico ADME/Tox prediction online tool (https://www.aidrug.re.kr/web/ accessed on 18 July 2022). This tool quickly and accurately predicts ADMET properties of molecules using the 2D structure of the molecule (SMILES code) that is extremely helpful in making decisions that can determine the success of designed compounds.

Conclusions
IRAK4 is a one of the important serine/threonine kinases that play fundamental role in cell signaling, inflammation, apoptosis, and cellular differentiation, which makes it an ultimate drug target for several types of cancers and autoimmune diseases. In this study, we have employed various molecular modeling techniques, such as molecular docking, MD simulation, and MM/PBSA binding free energy calculation, in order to examine and to identify the essential active site residues accountable for IRAK4 inhibition. A comprehensive investigation showed that active site residues Val200, Ala211, Tyr262, Val26, Met265 and Leu318 were important for the IRAK4 inhibition. It was concluded from the MM/PBSA binding free energy calculation that residues Tyr262, Leu318 and Met265 were found to be involved more in the total binding energy. Moreover, RF-CoMFA resulted in reasonable statistical models in terms of q 2 and r 2 (q 2 = 0.751, ONC = 4, r 2 = 0.911). The model was found to be predictive and reliable. Our docking and MD results were compatible with the analysis of contour maps produced using a chosen RF-CoMFA model, thereby it elucidated the structural features requisite to design more potent IRAK4 inhibitors. We proposed a new design strategy based on the overall analysis and acquired structural features to modify the ligand and designed eight IRAK4 inhibitors. Our designed IRAK4 inhibitors exhibited predicted activity (pIC 50 ) greater than the most potent compound of the diaminonicotinamide derivatives and their ADMET calculation showed promising results, which can be further evaluated using experimental studies for their specific contribution in the inhibition of IRAK4 as well as pharmacodynamics/pharmacokinetics properties. Our design scheme and predicted ADMET values could be useful for medicinal chemists or pharmaceutical companies to develop novel IRAK4 inhibitors.