Conformational Insight on WT- and Mutated-EGFR Receptor Activation and Inhibition by Epigallocatechin-3-Gallate: Over a Rational Basis for the Design of Selective Non-Small-Cell Lung Anticancer Agents

Non-small cell lung cancer (NSCLC) represents a difficult condition to treat, due to epidermal growth factor receptor (EGFR) kinase domain mutations, which lead to ligand-independent phosphorylation. Deletion of five amino acids (ELREA) in exon 19 and mutational change from leucine to arginine at position 858 (L858R) are responsible for tyrosine kinase domain aberrant activation. These two common types of EGFR-mutated forms are clinically associated with high response with Tyrosine Kinase Inhibitors (TKI); however, the secondary T790M mutation within the Tyrosine Kinase Domain (TKD) determines a resistance to these EGFR-TKIs. Using molecular dynamic simulation (MD), the present study investigated the architectural changes of wild-type and mutants EGFR’s kinase domains in order to detect any conformational differences that could be associated with a constitutively activated state and thus to evaluate the differences between the wild-type and its mutated forms. In addition, in order to evaluate to which extent the EGFR mutations affect its inhibition, Epigallocatechin 3-Gallate (EGCG) and Erlotinib (Erl), known EGFR-TKI, were included in our study. Their binding modes with the EGFR-TK domain were elucidated and the binding differences between EGFR wild-type and the mutated forms were evidenced. The aminoacids mutations directly influence the binding affinity of these two inhibitors, resulting in a different efficacy of Erl and EGCG inhibition. In particular, for the T790M/L858R EGFR, the binding modes of studied inhibitors were compromised by aminoacidic substitution confirming the experimental findings. These results may be useful for novel drug design strategies targeting the dimerization domain of the EGFR mutated forms, thus preventing receptor activation.


EGFRs structure modeling (SM)
The conformation of the small lacking loops between the crystallized domains of EGFR-TK receptor, and of the other fragments of EGFR has been well defined using a comparative modeling approach. On the contrary, a disordered structure corresponds to the small 16aa loop (Tyr975-Asp991) (yellow in the figure below), but this part is retrieved from the crystallographic structures 5cnn and 4rj4. Worth of notice, the sequence (Pro848-Pro877), is retrieved for the pdb template, since it is included as the final part of the TK-crystallized domains.
This loop is directly involved in an important aa mutation (Leu858-Thr858), which confers to the EGFR drug resistance; thus, this portion associated with high degrees of freedom, must be extensively explored through a MD simulation approach. In addition, the structure of the missing aminoacids (Thr892-Tyr1197) and other flexible domains were also obtained.

Post MD analyses
Specific studies aimed to better understand the movements of the dimerization domain of the three EGFR forms have been carried out, and the mean square displacement (MSD) of dimerization domain from the set of initial positions has been computed. This analysis provided us an easy way to determine the diffusion constant using the Einstein relation, which is a predictive relation about diffusive movements of particles subjected to a force field. It can be expressed in the form

Where:
D is diffusive coefficient of matter  is mobility of particle, which is a ratio between speed limit and a force applied to it kB is the Boltzmann constant T is the temperature (kelvin)

Comparison between 4HJO pdb structure and wild type EGFR model.
The generated EGFR wild type model was superimposed on tyrosin kinase domain reported in 4HJO pdb structure with root mean square deviation (RMSD) of 0.591 Å, showing a close homology ( Figure S1A). The Int. J. Mol. Sci. 2019, 20, x FOR PEER REVIEW 2 of 5 geometry of entire cytoplasmic EGFR model was evaluated with Ramachandran plot calculation using RAMPAGE. Ramachandran plot revealed that 82 % of residues are in favored region, 14.4% are in additionally allowed region and 4.6% are in outlier region ( Figure S1B). From Verify 3D analysis, it results 93.10% of the residues have averaged 3D-1D score >= 0.2. Figure S1. Superimposition of wild type EGFR model (orange) and 4HJO pdb template (blue); in dark blue is reported Erlotinib molecule in the ATP binding site (A). Ramachandran plots of wild type EGFR model.

RMSD results on wilt type, T790M/L858R EGFR cytoplasmic models
The root mean square deviation for the backbone atoms of wild type, T790M/L858R and ELREA deletion EGFR forms through the entire MD trajectory (RMSD vs Time) was calculated, and a stabilization was reached for all the EGFR forms at the end of the simulation time. In fact, the reached RMSD values (figure 2) were very low, as the systems converged to 1 ± 0.05 nm (wild type model), 1.01 ± 0.04 nm (T790M/L858R model) and 0.87 ± 0.07 nm (ELREA Deletion model). In the first 20 ns of the simulation trajectories, all EGFR systems had RSMD values comparable between them, reaching a common steady state and indicating a high stabilization degree. This result is very important because it allowed us to validate our prediction method, because the RMSD values of around 1 nm for the entire protein structure 450 aa long (445 aa for ELREA model) are relative to stable systems.  The same values of RMSD were observed along simulation time, and TK domain confirmed to be an ordered rigid domain, with RMSD values lower than 0.3 nm for all triplicates (figure S3A). The D domain showed a more flexibility, with RMSD values higher than 1.25 nm (figure S3B). However, the same values were found for triplicates, confirming the good degree of prediction.

Combined molecular docking/Molecular dynamics approach
The EGFR cytoplasmic forms obtained after MD stabilization (corresponding to the steady state) were used for molecular docking, at the aim to create the EGFR inactive forms in complexes with ligands considered.

wild type EGFR complexes
Analyzing the Michaelis complexes obtained after 20 ns of MD trajectories, some important differences have been found comparing Erl and EGCG dynamical binding mode. In the last 5 ns of simulations, Erl showed a RMSD value of 0.13 ± 0.05 Å, while EGCG showed a value of 0.12 ± 0.01 Å. (Figure 3A) The substantial difference is related to the estimated error, which is 5 times higher than that seen for system with EGCG. This means a significantly higher fluctuation for Erl than that observed for EGCG in the ATP binding site. From these data, EGCG showed a more efficient binding mode respect to Erl, as confirmed by RMSD value of total backbone of EGFR in association with two ligands studied ( figure 3B). The EGFR-Erl complex showed values of 1.5 ± 0.1 nm in the last 5 ns of simulation, while the EGFR-EGCG system highlighted a 0.90 ± 0.08 nm RMSD value. EGFR-EGCG complex showed a major stability respect to that obtained with Erl, due to important hydrophobic and hydrophilic stable interactions with different aminoacids.

ELREA EGFR complexes
Analysing the RMSD values of ligands in association with ELREA EGFR structure, Erl adopted a new stable conformation after 10 ns of MD simulation (figure S4A), reached an RMSD value of 0.13 ± 0.01 Å. Different situation was found for EGCG binding mode; the catechin molecule adopted the same conformation analyzing MD trajectories, but no particular stability was found, because the RMSD value reached 0.18 ± 0.04 Å, with a higher extimated error, and then an higher fluctuation degree in the ATP binding siteThis different behavior of two ligands influenced the stabilization pathway of the entire ELREA deletion EGFR ( figure 4B). The complex with Erl reached stability after 10 ns of MD simulation, in relation to the new stable conformation adopted for the ligand, with an RMSD value of 0.34 ± 0.01 nm. Similarly, EGCG determined different stabilization pathway for receptor, with RMSD value for the last 5 ns was 0.72 ± 0.06 nm. Figure S5. The root mean square deviations (RMSD) of ligands (a) and ELREA deletion EGFR backbone in complex with ligands (b) have been highlighted.

T790M/L858R EGFR complexes
Analysing the RMSD data obtained from MD simulation, two different stabilization pathways have been found for Erl and EGCG (figure S5A). The first compound showed a completely different conformation adopted after 8 ns of MD simulation, reaching an RMSD value of 0.22 ± 0.03 Å. The EGCG molecule showed the same RMSD value for all 20 ns of simulation, indicating no variations in the binding mode. EGCG appeared more able than Erl to efficacy interact with resistant EGFR form, but low hydrophobic contribution in the binding energy was observed. No peculiar results have been obtained analysing the RMSD value for the backbone of EGFR in complex with the two ligands (figure 5B). Both of systems reached average RMSD values of about 0.8 nm at the end of simulation but, with important deviations during the last 10 ns, showing no efficent stabilization pathway. These data and the low hydrophobic contribution found for both EGFR-Erl and EGFR-EGCG complexes indicated that neither of the two ligands studied seemed to be able to efficacy inhibit the T790M/L858R resistant form of the EGFR-TK domain.