Probing Conformational Dynamics by Protein Contact Networks: Comparison with NMR Relaxation Studies and Molecular Dynamics Simulations

Protein contact networks (PCNs) have been used for the study of protein structure and function for the past decade. In PCNs, each amino acid is considered as a node while the contacts among amino acids are the links/edges. We examined the possible correlation between the closeness centrality measure of amino acids within PCNs and their mobility as known from NMR spin relaxation experiments and molecular dynamic (MD) simulations. The pivotal observation was that plasticity within a protein stretch correlated inversely to closeness centrality. Effects on protein conformational plasticity caused by the formation of disulfide bonds or protein–protein interactions were also identified by the PCN analysis measure closeness centrality and the hereby introduced percentage of closeness centrality perturbation (% CCP). All the comparisons between PCN measures, NMR data, and MDs were performed in a set of proteins of different biological functions and structures: the core protease domain of anthrax lethal factor, the N-terminal RING domain of E3 Ub ligase Arkadia, the reduced and oxidized forms of human thioredoxin 1, and the ubiquitin molecules (Ub) of the catalytic Ub–RING–E3–E2–Ub complex of E3 ligase Ark2.The graph theory analysis of PCNs could thus provide a general method for assessing the conformational dynamics of free proteins and putative plasticity changes between different protein forms (apo/complexed or reduced/oxidized).


Introduction
Proteins are largely dynamic molecules with their overall conformation affecting their biological function. Crystallographic B-factors, NMR spin relaxation measurements, and molecular dynamic (MD) simulations provide insights into protein flexibility on an atomic scale. Specifically, NMR spin relaxation experiments provide site-specific information about backbone and sidechain dynamics throughout the protein. Nowadays, established NMR procedures allow for routine extraction of motional description, or parameterization in terms of amplitudes (S 2 order parameters) and characteristic timescales (τi) [1,2]. NMR experiments based on the methyl-TROSY effect that significantly improves spectral resolution and sensitivity have been used for the study of protein conformation of supramolecular complexes with molecular masses extending to 1 MDa [3]. However, protein parts of increased flexibility will not diffract long enough and may not be visible in crystal structures. Methods that probe into the protein dynamics of these undefined parts are largely computational and may (i) use automated protocols for the identification of functional dynamics from X-ray structures [4]; (ii) classify proteins according to their mobility patterns, improving annotation of protein function on the basis of protein dynamics [5]; Biophysica 2021, 1 158 and (iii) predict conformational heterogeneity by the combination of residue coevolution and MDs [6].
Network analysis of protein structures has been very useful for the analysis of structural stability of macromolecules in the last decade [7][8][9][10][11][12]. The "protein structure networks", known as "amino acid networks" or "protein contact networks" (PCNs) are constructed by nodes and edges. These networks consider amino acids and their respective interactions when their spatial distances within a protein structure are under a defined threshold (<7 Å i.e.,). The idea of essential contacts in a protein molecule was the core of an earlier methodologies such as (i) the lattice model of proteins that was introduced for studying the statistical mechanical characteristics of protein folding, unfolding, and fluctuations [13]; (ii) the "contact of structural units" such as helices, sheets, strands, and residues, embodied in the widely used contacts of structural units (or CSU) software [14]; and (iii) the "elastic network model" that investigates the magnitudes of motion of individual residues in the protein structure [15] and formed the basis of coarse-grained MDs studies [16].
Protein structure networks and their topological analysis are good tools for the analysis of protein folding [17,18], dynamics [19,20], and identification of key residues for proteinprotein interactions [21][22][23][24]. In the theory of protein structure networks, "centrality" describes the importance of an amino acid position within a network. The most frequently used centrality measures are "degree", "closeness", "betweenness", and "eigenvector" [25][26][27]. Betweenness and closeness are most useful in identifying crucial residues for protein folding and function [28][29][30]. Approaches utilizing protein networks have been recently explored to evaluate the dynamics of systems (from individual molecules to large protein assemblies) whose structures have been resolved [31] in order to generate information about ligand binding sites, protein-protein and/or protein-DNA interfaces, signal transduction, and allosteric communication mechanisms [31][32][33][34].
When proteins contact each other to form complexes, conformational changes may occur in dynamic/flexible parts of the corresponding contact areas. In the most of these cases, protein complexes may be too large for NMR spin relaxation experiments. Proteins may also undergo structural modifications resulting from interaction with other proteins or by changes within the same protein structure (formation of disulfide bridges), leading to altered protein dynamics in solution. The present study illustrates the use of PCN analysis as a suitable tool for qualitative evaluation of the conformational dynamics of free proteins as well as putative plasticity changes between different protein forms (apo/complexed or reduced/oxidized).

The Collection of PDB Structures
The names of proteins and Protein Data Bank (PDB) IDs used for the amino acid network reconstructions are summarized in Table 1. UbR and UbD of the complex Ub R -RING-E2-Ub D 5D0K

Construction and Centrality Measurements of Protein Structure Networks
Networks of proteins structures were built using the NAPS (Network Analysis of Protein Structures) web server (https://bioinf.iiit.ac.in/NAPS/help.html, accessed on 10 August 2020) [7]. In all constructed amino acid networks, every Cα atom of an amino acid was considered as a node. Edges corresponded to distances between different Cαs if distances between the corresponding amino acid residues were ≤~7 Å and were weighed according to the standard function wij = 1/dij (wij: edge weight and dij: distance between nodes i and j). Standard and long-range protein structure networks were constructed for each protein. The long-range networks were structured exclusively by edges between residues that were sequentially separated by 10 residues along the protein backbone chain.
In the protein contact networks, the closeness (or closeness centrality: the inverse of the shortest path distances of the node to all other nodes in the network) was measured. All topological analyses of the PCNs were performed using Cytoscape (https://cytoscape.org/, accessed on 10 August 2020) [35] and the NAPS web server [7]. To compare the closeness (closeness centrality) measures/values of PCNs between 2 different protein forms, we introduced the parameter "percentage of closeness centrality perturbation per residue" (% CCP) for each Cα of the polypeptide chain as In the case of reduced and oxidized human thioredoxin-1 (hTrx1), protein forms A and B correspond to reduced and oxidized protein, respectively. In the case of the 2 Ubs of the Ub R -RING-E2-Ub D complex, protein forms A and B correspond to Ub R and Ub D , respectively.

NMR Spin Relaxation Measurements and MDs
15 N NMR relaxation data (S 2 order parameters for the N-H backbone vector) for anthrax lethal factor core protease and N-terminal RING domain of E3 Ub ligase Arkadia were provided by the previous reported studies [36]. All NMR data were acquired at 25 • C on a Bruker Avance 600 MHz spectrometer equipped with a TXI cryoprobe. MDs of anthrax lethal factor core protease and reduced (PDB ID: 1ERT) and oxidized (PDB ID: 1ERU) human thioredoxin-1 (hTrx1) were from [36,37], respectively. MDs of N-terminal RING domain of E3 Ub ligase Arkadia were performed according to standard protocols described in detail previously [38]. The selected structure for simulations was the NMR solution structure of the RING domain (PDB ID:2KIZ). MDs were performed using GRO-MACS software (version 2016.3) [39] and AMBER-99SB-STAR-ILDNP force field for 150 ns simulation time. NMR assignments of anthrax lethal factor core protease and N-terminal RING domain of E3 Ub ligase Arkadia used in this work were deposited in the Biological Magnetic Resonance Bank (BMRB) under the entry codes 16735 and 15948, respectively.

PCNs Followed a Non-Scale Free Connectivity Distribution
The PCNs of the examined proteins displayed the general bell-shaped Poissonian behavior of other known protein structure networks (Figure 1a). This characteristic of PCNs is a marked exception to most self-organized biological networks (protein-protein and metabolic networks), which have scale-free behavior [19,27,[40][41][42][43]. In the scale-free networks, the majority of nodes are each connected to a limited number of other nodes, whereas only a few nodes (hubs) are connected to many nodes. The scale-free nature of networks may explain their robustness in certain physical and biological situations [44]. It is likely that the main reason for the deviation of the protein structure networks from other biological networks (e.g., protein-protein interaction maps) in terms of scale-free connectivity distribution is the characteristic average degree of a given amino acid due to their limited binding capacity (volume effect) [19,45]. When only long-range interactions were considered (distances between Cαs were ≥5 Å), the distribution of PCNs had a more scale-free behavior (Figure 1b), typical of most biological networks. It is possible that this is the case for other protein structure networks. The stability of a given NMR structure for example is dictated by long range nuclear Overhauser effects (NOEs) and not by the interactions of sequential amino acids. their limited binding capacity (volume effect) [19,45]. When only long-range interactions were considered (distances between Cαs were ≥5 Å), the distribution of PCNs had a more scale-free behavior (Figure 1b), typical of most biological networks. It is possible that this is the case for other protein structure networks. The stability of a given NMR structure for example is dictated by long range nuclear Overhauser effects (NOEs) and not by the interactions of sequential amino acids.

Protein Plasticity Correlated Inversely to Closeness Centrality.
Network metrics for anthrax lethal factor core protease domain (PDB ID: 2L0R) [36] were plotted against 15 N NMR relaxation data (S 2 order parameters for the N-H backbone vector) and RMSF (root mean square fluctuation) parameters from MD simulations (Figure 2) [36,46,47]. Closeness centrality measures of N-terminal RING domain of E3 Ub ligase Arkadia (PDB ID: 2KIZ) were compared with 15 N NMR relaxation data and RMSF parameters (Figure 3) [48][49][50][51]. These two proteins have completely different biological function and structure. In both cases, lower closeness centrality measures were correlated to increased mobility (elevated RMSF values or reduced S 2 parameters) of a protein segment and vice versa. Differences between NMR relaxation experiments and MDs and PCNs analysis were detected only for the loop (G 722 -Y 728 ) between helix α5 and α6 of the anthrax lethal factor core protease domain. This loop consists of residues with reduced number of long-range NOE connectivities [36] and is probably the reason why the NMR relaxation studies indicate higher flexibility for this region. Thus, results from different protein structures taking into consideration NMR and MD data suggested that elevated closeness centrality measures correlated to lower mobility of amino acids within a protein segment.

Protein Plasticity Correlated Inversely to Closeness Centrality
Network metrics for anthrax lethal factor core protease domain (PDB ID: 2L0R) [36] were plotted against 15 N NMR relaxation data (S 2 order parameters for the N-H backbone vector) and RMSF (root mean square fluctuation) parameters from MD simulations ( Figure 2) [36,46,47]. Closeness centrality measures of N-terminal RING domain of E3 Ub ligase Arkadia (PDB ID: 2KIZ) were compared with 15 N NMR relaxation data and RMSF parameters (Figure 3) [48][49][50][51]. These two proteins have completely different biological function and structure. In both cases, lower closeness centrality measures were correlated to increased mobility (elevated RMSF values or reduced S 2 parameters) of a protein segment and vice versa. Differences between NMR relaxation experiments and MDs and PCNs analysis were detected only for the loop (G 722 -Y 728 ) between helix α5 and α6 of the anthrax lethal factor core protease domain. This loop consists of residues with reduced number of long-range NOE connectivities [36] and is probably the reason why the NMR relaxation studies indicate higher flexibility for this region. Thus, results from different protein structures taking into consideration NMR and MD data suggested that elevated closeness centrality measures correlated to lower mobility of amino acids within a protein segment. Biophysica 2021, 1, FOR PEER REVIEW 5

Percentage CCP Changes May Indicate Alterations in Plasticity upon Formation of Disulfides within A Protein Sequence
Percentage CCP values between reduced (PDB ID: 1ERT) and oxidized (PDB ID: 1ERU) human thioredoxin-1 (hTrx1) were calculated and related to changes in the corresponding RMSF parameters from MD simulations [37]. MD simulations (for 10 ns) for the oxidized and reduced hTrx1 (Figure 4a) showed that the reduced molecule had increased Cα RMSFs (increased flexibility) for the loops between β2 and α2 (L3), between β3 and α3 (L5), and between α3 and β4 (L6). L3 contains the active site cysteines (Cys 32 -Gly-Pro-Cys 35 ), and thus reduction of the disulfide conferred increased flexibility to the region. Network-based comparison of the reduced versus oxidized hTrx1 revealed that the % CCPs of parts without any fluctuation changes (β1, α1, β2, as revealed by MD simulations) were in the order of ±0.3. The % CCP value of ±0.33 was therefore set as the relative threshold-noise value that included all non-significant % CCP changes. Under this consideration, significant negative % CCP values were observed for the three loops with increased flexibility (L3, 5, and 6; Figure 4b), explained by the lower plasticity of oxidized hTrx1 compared to the more flexible reduced form. The segments with negative % CCPs (with values more than threefold above the threshold (±0.33) corresponded to 0.5-0.7 Å increases in the RMSF values (MD simulations).

Percentage CCP Changes May Indicate Alterations in Plasticity upon Formation of Disulfides within A Protein Sequence
Percentage CCP values between reduced (PDB ID: 1ERT) and oxidized (PDB ID: 1ERU) human thioredoxin-1 (hTrx1) were calculated and related to changes in the corresponding RMSF parameters from MD simulations [37]. MD simulations (for 10 ns) for the oxidized and reduced hTrx1 (Figure 4a) showed that the reduced molecule had increased Cα RMSFs (increased flexibility) for the loops between β2 and α2 (L3), between β3 and α3 (L5), and between α3 and β4 (L6). L3 contains the active site cysteines (Cys 32 -Gly-Pro-Cys 35 ), and thus reduction of the disulfide conferred increased flexibility to the region. Network-based comparison of the reduced versus oxidized hTrx1 revealed that the % CCPs of parts without any fluctuation changes (β1, α1, β2, as revealed by MD simulations) were in the order of ±0.3. The % CCP value of ±0.33 was therefore set as the relative threshold-noise value that included all non-significant % CCP changes. Under this consideration, significant negative % CCP values were observed for the three loops with increased flexibility (L3, 5, and 6; Figure 4b), explained by the lower plasticity of oxidized hTrx1 compared to the more flexible reduced form. The segments with negative % CCPs (with

CCP Changes May Indicate Alterations in Plasticity upon Complexation
Human Ark2C is the only known RING E3 ligase. The protein participates in an alternative pathway for the activation of Ub transfer. Its RING domain binds to both E2-Ub and free Ub with high affinity, resulting in a catalytic active Ub R -RING-E2-Ub D complex with two Ubs [52]. MD simulations have shown that in the catalytic complex, the Ub bound to the RING domain (Ub R ) had greater mobility than the Ub bound to E2 (Ub D ) [38]. Calculations of % CCP between the two Ubs of the active Ub R -RING-E2-Ub D complex revealed different plasticities for the two molecules ( Figure 5). Therefore, Ub R is likely to have increased flexibility compared to Ub D , which is in agreement with the previous MD simulations. values more than threefold above the threshold (±0.33) corresponded to 0.5-0.7 Å increases in the RMSF values (MD simulations).  with two Ubs [52]. MD simulations have shown that in the catalytic complex, the Ub bound to the RING domain (UbR) had greater mobility than the Ub bound to E2 (UbD) [38]. Calculations of % CCP between the two Ubs of the active UbR-RING-E2-UbD complex revealed different plasticities for the two molecules ( Figure 5). Therefore, UbR is likely to have increased flexibility compared to UbD, which is in agreement with the previous MD simulations.

Discussion
The determination of the three-dimensional structure of proteins and their interactions and dynamics are paramount to structural biology. Functional dynamics provide vital information on the identification of key residues for protein-protein interactions and on structure-activity relationships that are of interest to many fields, e.g., pharmacology. Going beyond the static picture of protein structure has proven more challenging. Techniques such as NMR relaxation, fluorescence spectroscopy [53], and time-resolved X-ray crystallography [54] have helped. Integrations of the above experimental tools with computational approaches including MDs or network-based lattice or elastic models have allowed for the study of functional dynamics of proteins at atomic resolution in the time range of microseconds.

General Outcomes from the Measurements of Centrality Parameters
In any protein network, the length of the shortest path among different amino acids and their individual fluctuations are considered related [17,30]. As the shortest path has been described by the protein network parameter closeness centrality [17,18], the possible correlation between the closeness centrality measure of amino acids within PCNs, and their mobility, as known from NMR spin relaxation experiments and MDs, was explored.
Elevated closeness centrality values correlated to low mobility of a protein segment in the proteins examined. Positive or negative values of the introduced parameter % CCP reflected protein segments with reduced or increased intrinsic rigidity, respectively. The % CCP values (above the threshold) were able to distinguish subtle differences of Cα RMSFs in the whole structure of an oxidoreductase resulting from formation of a disulfide in its active site. Moreover, % CCP values were able to probe the different mobility profiles of the two different Ub molecules in the catalytic assembly Ub R -RING-E2-Ub D and showed, in agreement with the previous MDs [38], that Ub R bound to the RING was more flexible than Ub D bound to E2. Significant negative % CCP values were observed for the regions of Ub R with meaningful increased flexibility compared to Ub D . These data suggest that centrality measures of PCNs can detect fine flexibility changes in parts of protein structure occurring after interaction with other proteins or after small redox changes within their amino acid sequences.

Conclusions
Our findings suggest that novel in silico packages could be designed and developed considering (i) quantitative correlations between PCN centrality parameters and NMR relaxation measurements and MD simulations, and (ii) quantitative metrics, which can report on the correspondence of the network measures to the stability of protein structures, changes of plasticity in the exact contact areas of proteins, and hot spot residues in proteinprotein interfaces.