1. Introduction
The fundamental paradigm that has characterized most of the research in the field of protein computational biology relies on the sequence–structure–dynamics–function relationship [
1]. The linear sequence of amino acids arranges itself into the tertiary structure of the protein through the complex chemo-physical phenomenon of protein folding [
2,
3]. The folded protein structure is not a static entity, but it is generally highly dynamic and flexible. The resulting dynamics of the protein obviously depends on the three-dimensional protein structure. Finally, it has been demonstrated that the dynamics and flexibility of the protein acts towards its functionality and biological mechanism [
1,
4,
5]. Therefore, the sequence of amino acids generates the protein structure, which in turn drives the biological function through its dynamic and flexibility features.
Several computational approaches have been suggested to reproduce the protein dynamics and vibrations correctly. Molecular dynamics (MD) and normal mode analysis (NMA) represent two of the most widely used methodologies [
6,
7]. The former looks at the complete trajectories of the atoms, whereas the latter investigates the small-amplitude normal modes of the system, which is characterized by multi-parameter harmonic potentials. Later on, it has been shown that simplified models relying only on a single-parameter harmonic potential, i.e., an ensemble of Hookean springs, are sufficient to describe the slow dynamics of the protein with good details [
8]. Then, it has also been found that even coarse-grained models, only based on the C
α atoms of the protein, are able to properly reproduce the protein slow dynamics as well as the fluctuations observed in the X-ray crystallographic experiments, i.e., the B-factors [
9,
10].
Here, we report the results that arise from the application of modal analysis to the elastic lattice model (ELM) of hen egg-white lysozyme, a well-known globular protein. The structural model has been developed within a finite element (FE) framework, and the outcomes have provided useful insights regarding the lysozyme flexibility, the biologically relevant motions as well as the vibrational frequencies. In particular, the latter have been found to lie in the (sub-)THz frequency range. More detailed information about this analysis, and the obtained results can be found in [
11].
2. Materials and Methods
The structural model was developed within the FE code LUSAS [
12] and relies on modelling the protein structure as an ELM, i.e., a 3D spatial truss. The elastic bars of the truss were generated by taking into account only the C
α atoms of the lysozyme (PDB: 4YM8) and by connecting those nodes that lie closer than a selected cutoff value. Five different cutoffs were selected, i.e., 8, 10, 12, 15, and 20 Å, in order to generate various ELMs and investigate the influence of such a geometrical parameter on the outcomes. The elastic bars were assigned the same axial rigidity
EA, where
E is the Young’s Modulus and
A the cross-sectional area. The value of rigidity was defined based on the comparison between the experimental B-factors and the computed ones (see below). The masses of all the ELM nodes were set equal among them, and equal to the total mass of the protein divided by the number of nodes [
11].
Based on the generated ELMs, modal analysis was carried out, by solving the multi-degree-of-freedom (MDOF) free-vibration problem within the FE environment. This implies solving the following eigenvalue–eigenvector equation:
K and
M being the global stiffness and mass matrix of the ELM, respectively, and
ωn and
δn the eigenvalue (angular frequency) and eigenvector (mode shape) related to the n
th vibrational mode. Note that, for an ELM counting
N nodes and which is not externally constrained, we obtain 3
N-6 non-zero eigenvalues and 3
N-6 non-rigid eigenvectors [
11]. The stiffness and mass matrix were computed within the FE environment as [
13]:
where
kb* represents the 2 × 2 stiffness matrix of the b
th elastic bar in the local reference system,
Nb is the 2 × 6 rotation matrix for the b
th bar between the local and global reference systems,
Cb is the 6 × 3
N expansion matrix, which is used to expand the stiffness matrix of the b
th bar to the global structural dimension,
m is the mass of each ELM node, and
I is the 3
N × 3
N identity matrix [
11].
One of the most used methods for the extraction of the normal modes from the coarse-grained protein crystal structure is the anisotropic network model (ANM) [
10]. The FE-based ELM described above can be seen as the counterpart of the ANM by following a purely structural mechanics approach. The main difference is that the ELM defined above also includes the quantitative information about the mass of the system, which is commonly not included in ANM calculations [
11]. Nevertheless, it can be easily demonstrated that the stiffness matrix of the ELM
K corresponds to the Hessian matrix of the ANM
H. The latter can be represented as a
N ×
N matrix, whose elements are 3 × 3 sub-matrices
Hij. These contain the second partial derivatives of the harmonic potential of the linear spring connecting nodes
i and
j (see [
10,
11,
14] for more details). In [
11], we have shown that there exists complete consistency between the ELM
K and the ANM
H. In particular, when the decay parameter of the ANM spring constant is set equal to 1 [
14], the two matrices are exactly the same.
Based on the eigenvalues and eigenvectors obtained from Equation (1), the B-factors can be numerically computed as [
11,
15]:
being
Bi the B-factor of the i
th node,
kB the Boltzmann’s constant,
T the absolute temperature,
δi,n the absolute displacement of the i
th node according to the n
th mode, and
ωn the angular frequency of the n
th mode shape. The B-factors obtained from Equation (3) can be directly compared with the experimental ones contained in the PDB file [
16]. Moreover, by posing that the average value of the experimental B-factors should match the average value of the computed ones, one can fix the axial rigidity
EA to be assigned to the elastic bars of the ELM.
3. Results and Discussion
Table 1 reports the structural features of the five generated ELMs, according to the selected cutoff value. It can be seen that, increasing the cutoff value, the number of elastic connections within the ELM increases, therefore the mean length of the elastic bars is greater. However, in order to have consistency between the average value of the numerical and experimental B-factors, the axial rigidity
EA assigned to the elastic bars needs to decrease as the cutoff increases. As a result, the axial stiffness of the mean elastic connection gets lower for higher cutoffs.
Table 1 also shows the obtained Pearson correlation coefficients between the experimental and numerical B-factors. As can be seen, the correlation is around 60–70%, and it tends to increase for higher cutoffs [
11].
From the analysis of the eigenvectors
δn, i.e., the mode shapes, it was found that the low-frequency vibrational modes obtained from the ELMs of lysozyme reproduce fairly accurately the biologically relevant motion of the lysozyme [
11]. Additionally, the ELM-based mode shapes were found to imply high flexibility in the same regions of the protein as previously obtained by Levitt et al. [
7], who made use of NMA in internal coordinates. The cutoff parameter was found to have a certain influence on the obtained mode shapes. As extensively reported in [
11], the mode shapes obtained from models with close cutoff values, i.e., 10, 12, and 15 Å, exhibit several common features. Conversely, using much lower and much higher values of the geometrical cutoff, i.e., 8 and 20 å, leads to some differences in the displacement field of the mode shapes. Nevertheless, all models were efficiently able to detect the regions expected to have the highest flexibility.
From Equation (1), the angular frequencies
ωn associated to each mode shape were also obtained, which are linearly related to the vibrational frequencies
fn (
ωn = 2
πfn).
Figure 1 shows the frequencies for the lysozyme ELM, depending on the selected cutoff value.
Figure 1a displays the entire set of frequencies according to the mode number, whereas
Figure 1b reports the distribution of the frequency values. As can be see, the vibrational frequencies lie in the (sub-)THz range. The lowest frequency ranges from 0.046 THz for Model A up to 0.117 THz for Model E [
11]. It can be inferred that increasing the cutoff value leads to higher values of the frequency. This phenomenon occurs up to about the 50th mode. Afterwards, it can be seen that higher values of the frequency are obtained for the lower cutoff values. Finally,
Figure 1b shows that the histograms of frequencies, which resemble Gaussian-like distributions. This seems to be a common feature to all globular proteins, as shown in [
17,
18]. It is also evident that, adopting higher cutoff values for the protein ELM, the whole range of obtained frequencies becomes remarkably narrower.