Allostery and Epistasis: Emergent Properties of Anisotropic Networks

Understanding the underlying mechanisms behind protein allostery and non-additivity of substitution outcomes (i.e., epistasis) is critical when attempting to predict the functional impact of mutations, particularly at non-conserved sites. In an effort to model these two biological properties, we extend the framework of our metric to calculate dynamic coupling between residues, the Dynamic Coupling Index (DCI) to two new metrics: (i) EpiScore, which quantifies the difference between the residue fluctuation response of a functional site when two other positions are perturbed with random Brownian kicks simultaneously versus individually to capture the degree of cooperativity of these two other positions in modulating the dynamics of the functional site and (ii) DCIasym, which measures the degree of asymmetry between the residue fluctuation response of two sites when one or the other is perturbed with a random force. Applied to four independent systems, we successfully show that EpiScore and DCIasym can capture important biophysical properties in dual mutant substitution outcomes. We propose that allosteric regulation and the mechanisms underlying non-additive amino acid substitution outcomes (i.e., epistasis) can be understood as emergent properties of an anisotropic network of interactions where the inclusion of the full network of interactions is critical for accurate modeling. Consequently, mutations which drive towards a new function may require a fine balance between functional site asymmetry and strength of dynamic coupling with the functional sites. These two tools will provide mechanistic insight into both understanding and predicting the outcome of dual mutations.


Introduction
A growing body of data on the human genome suggest that within the exome (the protein coding region), one individual may possess 10,000 or more non-synonymous nucleotide variants, many of which occur at positions which are not evolutionarily conserved [1][2][3]. Predicting the functional outcome of mutations at non-conserved sites remains an extremely difficult challenge. In particular, providing accurate predictions about the impact of these variations is difficult when only considering single, independent point mutations without accounting for the background of other positions and their chemical specificity (i.e., context dependence).
One reason why predicting the impact of mutations may fail is that extensive epistasis occurs during evolution [4][5][6]. Epistasis is defined as a context-dependent functional outcome, where, the alternative context could be just one single amino acid difference, or it could be a paralog with 25% sequence identity. Experimentally, epistasis manifests as a non-additive outcome from two or more amino acid changes within a protein. The effects can be dramatic. For example, a substitution may only confer a beneficial effect upon fixation of a second-site, also known as a "permissive" change; conversely, a neutral substitution might become deleterious in the presence of other "restrictive" when two mutational sites are simultaneously perturbed by random forces versus the response when individual force perturbations are exerted one at a time to the mutational sites.
In order to determine whether EpiScore can identify the degree and strength of epistatic relationships between position pairs, we first applied our analysis to the deep scanning database of double mutations between all positions in the IgG-binding domain of protein GB1 [24]. These modern, high-throughput screens (e.g., deep mutational scans) assay large numbers of mutants (up to 10 8 ) [34][35][36][37], but the information is largely qualitative. Therefore, to further test whether our approach can identify epistatic relations which specifically emerge during the evolution of new function, we applied our methodology to two different protein systems where the traditional biochemical quantifications of mutational effects (e.g., k cat , K M , IC 50 ) for a range of substrates are available. These two systems, P. falciparum DHFR (pfDHFR) and a β-lactamase (TEM-1), naturally confer resistance to drugs and the trajectories of these resistances as well as their epistatic relationships have been explored [22,23]. Importantly, these two systems are also known to be allosteric proteins.
We first observed that EpiScore can distinguish positive and negative epistasis in dual mutations when analysis was performed over 1045 single mutants and 509,963 double mutants of GB1. We also found that the average EpiScore value correlates well with experimental epistatic measures calculated using pyrimethamine IC 50 values of pfDHFR dual mutants and the catalytic turnover rates for cefotaxime of TEM-1 dual mutants. Furthermore, each pfDHFR amino acid pair exhibits distinct distributions of EpiScore values showing the importance of how these two positions communicate with the active site through the anisotropic interaction network.
Interestingly, DCI is usually not symmetric, i.e., the fluctuation response of position i upon exerting random forces on j is not identical to the response of j when position i is perturbed; we calculate this asymmetry with DCI asym . We applied our DCI asym analysis to the TEM-1 dual-mutant sites and found that, indeed, a relationship exists between dynamic coupling asymmetry and EpiScore when all active sites in the TEM-1 system are considered. Specifically, two of the three dual mutant positions which exhibited the largest positive epistasis in cefotaxime k cat /K M from the wild-type had both EpiScore values < 1 (indicating strong non-additivity) with respect to active site S70. Additionally, these dual mutants also exhibit asymmetry in dynamic coupling based on DCI asym , with consistent unidirectionality from active sites site to mutation sites in long range communication. We propose that this communication directionality signature should be readily apparent in known allosteric systems as mentioned above. Therefore, we applied a similar analysis to a Pin1 protein well-studied for its dynamic allostery and showed that the DCI asym between the catalytic binding sites and non-catalytic distal binding sites presents a unique directionality in long distance dynamic coupling, leading to a cause-and-effect relationship between allosteric sites and active sites also observed in epistatic interactions.

Methods
We previously designed a unique way to capture site-specific coupling between residue pairs or groups of residues, the Dynamic Coupling Index (DCI). The underlying premise behind DCI is the importance of a system's response to a force perturbation, be that protein-solvent, protein-protein, protein-ion or protein ligand interactions. Additionally, the point mutations here are modeled by the response of a system to a perturbation at a specific site, a.k.a. a single amino acid.
DCI is a combination of the Elastic Network Model (ENM) and Linear Response Theory (LRT) where the protein is modeled by representing the amino acids as nodes in a network connected by Hookean springs (Figure 1). The interaction between two amino acids close in space due to their 3-dimensional structure is represented by a simple harmonic function. A random Brownian kick in the form of a unit force perturbation is applied to an individual position which generates a response propagating through the rest of the structure, causing other positions to respond to this perturbation through the network of interactions. Using LRT, we can calculate the fluctuation response ∆R (Equation (1)) of each position and create response vector that measures the magnitude and direction (x, y and z) of displacement of every residue from its mean. As mentioned above, this (to the first order) mimics the effects of in vivo interactions of a protein. For example, a ligand binding event will apply a force to residues in the binding pocket of a receptor protein. In our perturbation residue scanning (PRS) approach, this is averaged over many unit force directions to simulate an isotropic perturbation. [ Entropy 2020, 22, x FOR PEER REVIEW 4 of 18 (x, y and z) of displacement of every residue from its mean. As mentioned above, this (to the first order) mimics the effects of in vivo interactions of a protein. For example, a ligand binding event will apply a force to residues in the binding pocket of a receptor protein. In our perturbation residue scanning (PRS) approach, this is averaged over many unit force directions to simulate an isotropic perturbation. Here, each residue within the structure is represented as a single node at the Cα position, connected to other nodes via Hookean springs. Using a combination of Perturbation Response Scanning (PRS) and Linear Response Theory (LRT) [38,39], each residue is perturbed by a Brownian kick applied as an isotropic external force which then generates a fluctuation response in all other residues within the network. This figure was rendered in PyMol [40].
H is the Hessian, a 3N × 3N matrix which can be constructed from 3-dimensional atomic coordinate information where it is composed of the second order derivatives of the harmonic potential energy with respect to the components of the position vector of length 3N. The Hessian matrix can be extracted directly from molecular dynamics simulations as the inverse of the covariance matrix. This method allows one to implicitly capture specific physiochemical properties and more accurate residue-residue interactions via atomistic force fields and subsequent all-atom simulation data. However, for the purposes of this paper, we wished to investigate only those relationships which could be derived solely from inter-atomic distances of single protein structures and thus we used the ENM version of our approach.
Repeating this process, each position in the structure is perturbed sequentially to generate a perturbation response matrix A Here, each residue within the structure is represented as a single node at the Cα position, connected to other nodes via Hookean springs. Using a combination of Perturbation Response Scanning (PRS) and Linear Response Theory (LRT) [38,39], each residue is perturbed by a Brownian kick applied as an isotropic external force which then generates a fluctuation response in all other residues within the network. This figure was rendered in PyMol [40].
H is the Hessian, a 3N × 3N matrix which can be constructed from 3-dimensional atomic coordinate information where it is composed of the second order derivatives of the harmonic potential energy with respect to the components of the position vector of length 3N. The Hessian matrix can be extracted directly from molecular dynamics simulations as the inverse of the covariance matrix. This method allows one to implicitly capture specific physiochemical properties and more accurate residue-residue interactions via atomistic force fields and subsequent all-atom simulation data. However, for the purposes of this paper, we wished to investigate only those relationships which could be derived solely from inter-atomic distances of single protein structures and thus we used the ENM version of our approach.
Repeating this process, each position in the structure is perturbed sequentially to generate a perturbation response matrix A where ∆R j i = (∆R) 2 is the magnitude of fluctuation response at position i due to the perturbations at position j. From this perturbation response matrix, we can construct DCI. DCI ij , then, represents the displacement response of position i upon perturbation to a single functionally important position (or subset of positions) j, relative to the average fluctuation response of position i when all of the positions within a structure are perturbed.
As such, DCI can be considered a measure of the dynamic coupling between residue i and residue(s) j upon perturbation to residue(s) j.
It is often more convenient to represent DCI as a percentile rank, where m ≤i is the number of positions with a DCI value ≤ DCI ij for a system of N residues. One of the most important aspects of DCI is that the entire network of interactions is explicitly included in subsequent calculations without the need of dimensionality reduction such as Normal Mode Analysis through principal component analysis. If one considers interactions such as allostery as an emergent property of an anisotropic interaction network, it is critical to include the interactions of the entire network to accurately model the effect one residue can have on another.
Here, we present two further extensions of DCI which allow us to uniquely model allosteric interactions and epistatic effects; EpiScore and DCI asym , respectively. EpiScore can identify or describe potential non-additivity in substitution behavior between residue pairs. This metric can capture the differences in a normalized perturbation response to a position k when a force is applied at two residues i and j simultaneously versus the average additive perturbation response when each residue i, j, is perturbed individually (Figure 2). EpiScore values < 1 (>1) indicate that the additive perturbations of positions i and j generates a greater (lesser) response at position k than the effect of a simultaneous perturbation. This means that, when treated together with a simultaneous perturbation at both sites i and j, the displacement response of k is lower (higher) as compared to the average effect of individual perturbations to i and j, one at a time. As EpiScore is a linear scale, the further the value from 1, the greater the effect described above.
Interestingly, through the use of DCI we can capture asymmetry between different residues within a protein, as coupling in and of itself is asymmetric within an anisotropic network. That is, each amino acid has a set of positions to which it is highly coupled, and this anisotropy in connections gives rise to unique differences in coupling between a given i j pair of amino acids which do not have direct interactions (Figure 3). DCI asym , then, is simply DCI ij (the normalized displacement response of position j upon a perturbation to position i) − DCI ji (Equation (5)). Using DCI asym we can determine a cause-effect relationship between the i j pair in terms of force/signal propagation between these two positions. Interestingly, through the use of DCI we can capture asymmetry between different residues within a protein, as coupling in and of itself is asymmetric within an anisotropic network. That is, each amino acid has a set of positions to which it is highly coupled, and this anisotropy in connections gives rise to unique differences in coupling between a given i j pair of amino acids which do not have direct interactions ( Figure 3). DCIasym, then, is simply DCIij (the normalized displacement response of position j upon a perturbation to position i) − DCIji (Equation 6).Using DCIasym we can determine a cause-effect relationship between the i j pair in terms of force/signal propagation between these two positions.

DCI
= DCI DCI (6) %DCI = %DCI %DCI (7) Entropy 2020, 22, x FOR PEER REVIEW 7 of 18 Figure 3. Example of asymmetric coupling between residue R68 of the PPIase domain and A31 of the WW domain in Pin1 (PDB ID 3TCZ [7]). The differences in local contacts give rise to network inhomogeneities which subsequently result in different %DCI values from R68 to A31 versus A31 to R68 (left). The subtraction of these two values gives a measure of coupling directionality upon external perturbations between these two sites (right). These figures were rendered in PyMol [40].

Epistasis and EpiScore
To investigate the relationship between internal networking and epistasis, we first apply our analysis to protein G domain B1 (GB1, PDB ID 2QMT [42]), for which there exists a comprehensive set of mutational data. Specifically, fitness effects of mutations were determined with high confidence for 1045 single mutants and 509,963 double mutants, with data available for all 1485 possible position pairs [24]. In this work, experimental epistasis was calculated as ln(Wab) − ln(Wa) − ln(Wb), where Wab represents the fitness for the dual mutant and Wa and Wb are the fitness values for the single mutants. Here we investigate the relationship between the experimental epistasis and EpiScore by comparing the average EpiScore for each position pair with instances of positive (blue) or negative (red) epistasis using the skewness of the experimental epistasis distribution over the full mutational space available for a given pair ( Figure 4). Skewness was chosen as it more accurately  [7]). The differences in local contacts give rise to network inhomogeneities which subsequently result in different %DCI values from R68 to A31 versus A31 to R68 (left). The subtraction of these two values gives a measure of coupling directionality upon external perturbations between these two sites (right). These figures were rendered in PyMol [40].

Epistasis and EpiScore
To investigate the relationship between internal networking and epistasis, we first apply our analysis to protein G domain B1 (GB1, PDB ID 2QMT [42]), for which there exists a comprehensive set of mutational data. Specifically, fitness effects of mutations were determined with high confidence for 1045 single mutants and 509,963 double mutants, with data available for all 1485 possible position pairs [24]. In this work, experimental epistasis was calculated as ln(W ab ) − ln(W a ) − ln(W b ), where W ab represents the fitness for the dual mutant and W a and W b are the fitness values for the single mutants. Here we investigate the relationship between the experimental epistasis and EpiScore by comparing the average EpiScore for each position pair with instances of positive (blue) or negative (red) epistasis using the skewness of the experimental epistasis distribution over the full mutational space available for a given pair (Figure 4). Skewness was chosen as it more accurately represented the substitution behavior than position averages, which would often tend towards zero without capturing the substitution behavior for a given position pair. EpiScore values were calculated for all position pairs relative to every other position within the protein and averaged over, generating one average EpiScore value for each pair. Interestingly, when we obtained the average EpiScore distribution of experimental positive and negative epistatic pairs we found that EpiScore values above and below one tend to distinctly divide positive from negative epistasis; positive experimental epistasis was more frequently skewed towards EpiScore > 1, and likewise negative cases are skewed towards EpiScore < 1.  . EpiScore values above and below one (dashed line) tend to distinctly divide cases for which experimental epistasis was more frequently skewed towards the positive (below one) and negative (above one)).
The full system analysis of GB1 showed the existence of a general trend between epistasis and EpiScore; particularly, an inverse relationship between EpiScore above or below one and skewness in experimental epistasis, indicating that positions with EpiScore less than 1 more often work cooperatively towards beneficial protein function, whereas pairs yielding EpiScore values greater than 1 usually result in antagonistic interactions which impair function. In an effort to elucidate more specific mechanistic details or trends underlying epistatic interactions which may exist in other systems, we broaden our application of EpiScore to other known epistatic proteins with a focus on specific mutation pairs. As such, we next study DHFR, a protein involved in the development of anti-malarial resistances in malarial parasites. Anti-malarial drugs commonly target the DHFR, which catalyzes the reduction of dihydrofolate and is essential to cellular growth and proliferation. Pyrimethamine is one such drug, used to treat malaria caused by one of the most common malarial parasites, Plasmodium falciparum, by competitively inhibiting DHFR. While exhibiting a particularly low sequence conservation between species, most differences in sequence are from flexible loop regions [43], while the secondary structures between these loops are highly conserved across all species [44]. However, widespread use of pyrimethamine has resulted in a prevalence of pyrimethamine-resistant P. falciparum DHFR (pfDHFR) mutants as a result of four key amino acid substitutions at positions N51, C59, S108 and I164 which have also exhibited significant epistasis between mutation combinations [22] ( Figure 5A). . EpiScore values above and below one (dashed line) tend to distinctly divide cases for which experimental epistasis was more frequently skewed towards the positive (below one) and negative (above one)).
The full system analysis of GB1 showed the existence of a general trend between epistasis and EpiScore; particularly, an inverse relationship between EpiScore above or below one and skewness in experimental epistasis, indicating that positions with EpiScore less than 1 more often work cooperatively towards beneficial protein function, whereas pairs yielding EpiScore values greater than 1 usually result in antagonistic interactions which impair function. In an effort to elucidate more specific mechanistic details or trends underlying epistatic interactions which may exist in other systems, we broaden our application of EpiScore to other known epistatic proteins with a focus on specific mutation pairs. As such, we next study DHFR, a protein involved in the development of anti-malarial resistances in malarial parasites. Anti-malarial drugs commonly target the DHFR, which catalyzes the reduction of dihydrofolate and is essential to cellular growth and proliferation. Pyrimethamine is one such drug, used to treat malaria caused by one of the most common malarial parasites, Plasmodium falciparum, by competitively inhibiting DHFR. While exhibiting a particularly low sequence conservation between species, most differences in sequence are from flexible loop regions [43], while the secondary structures between these loops are highly conserved across all species [44]. However, widespread use of pyrimethamine has resulted in a prevalence of pyrimethamine-resistant P. falciparum DHFR (pfDHFR) mutants as a result of four key amino acid substitutions at positions N51, C59, S108 and I164 which have also exhibited significant epistasis between mutation combinations [22] ( Figure 5A). An EpiScore analysis applied collectively to the behavior of the functionally important FG loop shows an immediate relationship between epistasis in pyrimethamine IC 50 values of the pairwise mutants and their associated EpiScore values. Figure 5B shows the EpiScore violin plots (i.e., distributions and kernel density estimates) with respect to FG loop residues 196-206 for each pfDHFR mutant pair. These violin plots show that EpiScore distribution for different mutation pairs yields a different distribution for different residue pairs. S108-I164 gives a narrow distribution with a peak around 1, suggesting that force perturbations simultaneously exerted on these positions yields the same fluctuation response profile of the FG loop positions as the average of individual fluctuation responses of the FG loop when the forces are exerted individually at S108 and I164. This distribution pattern was also observed for positions N51 and C59, although at completely different positions within the protein. On the other hand, pairing the position I164 with C59 rather than with S108 results in a completely different EpiScore distribution, with diverse fluctuation responses of FG loop positions. This suggests that I164 and C59 are highly cooperative, leading to a non-additive behavior when these two positions are perturbed simultaneously. As I164 and C59 are located at different regions of the protein (Figure 5A), one can expect to observe a wide range of EpiScore values associated with this pair. This pattern tends to hold with distally located positions in the N51-S108 and C59-S108 distributions as well. Interestingly however, N51-C59 also exhibits EpiScore values less than 1, despite the fact that they belong to the same helical region. The distributions suggest that anisotropy in the network of interactions could modulate a wide range of fluctuation responses via these position pairs, which result in different functional behavior upon mutation. To determine whether the change in fluctuation response of the FG loop to simultaneous perturbations at these mutational positions can capture functional substitution outcomes, we next investigate the relationship between EpiScore and experimentally measured epistasis using pfDHFR pyrimethamine IC 50 values. Figure 5C presents the average EpiScore values with respect to the FG loop for each pfDHFR pairwise mutant, in order of increasing pyrimethamine IC 50 epistasis. A dashed line at an EpiScore value of 1.0 has been added to aid in visual inspection. Here, IC 50 epistasis is reported as the IC 50 ratio of the dual mutant to the IC 50 sum of the individual mutants. Any FG loop residue which was within 10 angstroms of either mutation site per dual mutant was excluded from the averaging in order to eliminate any strong dynamic coupling effects that arise as a result of direct contact interactions. The average EpiScore values have a strong, negative correlation (R = −0.77) with IC 50 epistasis, where the stronger the positive epistasis, the lower the average EpiScore value. For example, an EpiScore value of~0 means the pairwise dynamic coupling to FG loop positions of a dual mutant pair is negligible as compared to the average individual dynamic coupling; that is, the distal sites can individually impact position the FG loop residues allosterically. However, when treated together with a simultaneous perturbation at both sites, the displacement response of the FG loop residues are significantly lower, and, subsequently, their joint ability to allosterically regulate these FG loop positions is effectively lost. Due to the interaction network between the two distal positions with the FG loop, they may antagonistically compensate the amplitude and direction of the response when the perturbations on these two sites are exerted at the same time. To the reverse, an EpiScore value >> 1 suggests that, simultaneously, two positions may exhibit dynamic coupling to the FG loop enough such that their pairwise mutational impact fundamentally alters the role the FG loop plays within the pfDHFR interaction network resulting in loss of function.
At first, this relationship may seem somewhat counterintuitive, as one could reasonably expect that the higher the EpiScore value (i.e., the stronger the dual position dynamic coupling versus individually averaged dynamic coupling), the higher the experimental epistasis. However, when complexed with substrate, the functionally critical M20 loop [45] is stabilized in part through interactions with amino acids in the FG loop [46]. It is possible that it is more favorable, in terms of pyrimethamine resistance, to have mutations occur at position pairs that induce a smaller fluctuation response of FG loop when perturbed simultaneously, (i.e., restricting the dynamics) than the average fluctuation response of individual perturbations applied one at a time. This is in agreement with previous work which showed that point mutations to two of the FG loop amino acids in E.coli resulted in a > 30 fold decrease in the steady state hydride transfer rate constant as compared to the wild-type [47]. This could additionally explain the pervasive and persistent nature of these mutations appearing globally in pfDHFR proteins. An EpiScore analysis applied collectively to the behavior of the functionally important FG loop shows an immediate relationship between epistasis in pyrimethamine IC50 values of the pairwise mutants and their associated EpiScore values. Figure 5B shows the EpiScore violin plots (i.e., distributions and kernel density estimates) with respect to FG loop residues 196-206 for each pfDHFR mutant pair. These violin plots show that EpiScore distribution for different mutation pairs yields a different distribution for different residue pairs. S108-I164 gives a narrow distribution with a peak around 1, suggesting that force perturbations simultaneously exerted on these positions yields the same fluctuation response profile of the FG loop positions as the average of individual fluctuation responses of the FG loop when the forces are exerted individually at S108 and I164. This distribution pattern was also observed for positions N51 and C59, although at completely different positions within the protein. On the other hand, pairing the position I164 with C59 rather than with S108 results in a completely different EpiScore distribution, with diverse fluctuation responses of FG loop positions. This suggests that I164 and C59 are highly cooperative, leading to a non-additive behavior when these two positions are perturbed simultaneously. As I164 and C59 are located at different regions of the protein ( Figure 5A), one can expect to observe a wide range of EpiScore Expanding our study to another system important to the concept of antibiotic resistance, we analyze TEM-1, a protein which possesses antibiotic resistance largely driven by its high evolvability, with over 170 TEM-1 mutants discovered as clinical or hospital isolates [49]. TEM-1 is a well-studied enzyme in experimental or laboratory-guided evolution, in an effort to both understand the mechanisms associated with its antibiotic resistance as well as predict possible resistance-conferring mutations [49][50][51][52].
Previous work has shown that the majority of the resistance-conferring mutations in TEM-1 are both distal to (10 Å or further) and highly coupled with the active site residues [53], indicating that these mutations impact TEM-1 function by allosterically regulating active site behavior. Additionally, it is now also understood that mutations resulting in the emergence of new enzymatic function are generally destabilizing which suggests that the evolution of new function requires additional, stabilizing mutations. As such, a more complete understanding of TEM-1 mutational behavior requires an investigation into the epistatic interplay of point mutation combinations. Thus, it is an ideal system for exploration of long-range dynamic communication to understand epistatic relationships in the emergence of resistance.
Here we focus on the specific epistatic relationship between four TEM-1 mutation sites (42, 104, 182 and 238) which have exhibited significant non-additive behavior [23]. Treating point mutations as external perturbative forces to the internal network of a protein, we apply EpiScore analysis to the main TEM-1 active site, residue S70, using a TEM-1 3-D structure obtained by an energy-minimized and equilibrated version of PDB ID 1BTL [53] with mutation sites shown as blue spheres in Figure 6A, along with active site S70 in red and alternative control sites (43,105,181 and 237) in yellow. Figure 6B (left) shows a plot of EpiScore versus experimental epistasis using cefotaxime turnover rates and exhibits a relationship similar to that found in pfDHFR, with a strong negative correlation of R = −0.71. We also find that position pairs with EpiScore values > 1.0 (horizontal dashed line), presenting a stronger pairwise dynamic coupling with position S70 compared to the average of the individual dynamic coupling, also corresponds to two of the three TEM-1 dual mutants with negative epistatic turnover rates (separated by vertical dashed line). Position pair 182-238 represents a deviation from this behavior, and position pair 42-104 is a comparative outlier to the overall correlation. The deviation of position pair 182-238 may be related to specific catalytic site interactions associated with position 238, the only position in which mutation resulted in an increase in turnover rate across all eight possible combinations of TEM-1 background. Interestingly, position 182, present in all position pairs with the three highest EpiScore values, was also the position in which mutation resulted in a significantly beneficial effect in the fewest number of possible backgrounds [23]. As a control, we also conducted this analysis using the alternative sites representing positions immediately adjacent to the four mutation sites ( Figure 6B (right)). These positions result in a significantly worse correlation with cefotaxime turnover rate epistasis than the mutation positions (R = −0.45 as compared to R = −0.71), showing the sensitivity in the EpiScore metric to specific positions, regardless of separation distance. behavior requires an investigation into the epistatic interplay of point mutation combinations. Thus, it is an ideal system for exploration of long-range dynamic communication to understand epistatic relationships in the emergence of resistance.
Here we focus on the specific epistatic relationship between four TEM-1 mutation sites (42, 104, 182 and 238) which have exhibited significant non-additive behavior [23]. Treating point mutations as external perturbative forces to the internal network of a protein, we apply EpiScore analysis to the main TEM-1 active site, residue S70, using a TEM-1 3-D structure obtained by an energy-minimized and equilibrated version of PDB ID 1BTL [53] with mutation sites shown as blue spheres in Figure  6A, along with active site S70 in red and alternative control sites (43,105,181 and 237) in yellow. Figure 6B (left) shows a plot of EpiScore versus experimental epistasis using cefotaxime turnover rates and exhibits a relationship similar to that found in pfDHFR, with a strong negative correlation of R=-.71. We also find that position pairs with EpiScore values > 1.0 (horizontal dashed line), presenting a stronger pairwise dynamic coupling with position S70 compared to the average of the individual dynamic coupling, also corresponds to two of the three TEM-1 dual mutants with negative epistatic turnover rates (separated by vertical dashed line). Position pair 182-238 represents a deviation from this behavior, and position pair 42-104 is a comparative outlier to the overall correlation. The deviation of position pair 182-238 may be related to specific catalytic site interactions associated with position 238, the only position in which mutation resulted in an increase in turnover rate across all eight possible combinations of TEM-1 background. Interestingly, position 182, present in all position pairs with the three highest EpiScore values, was also the position in which mutation resulted in a significantly beneficial effect in the fewest number of possible backgrounds [23]. As a control, we also conducted this analysis using the alternative sites representing positions immediately adjacent to the four mutation sites ( Figure 6B (right)). These positions result in a significantly worse correlation with cefotaxime turnover rate epistasis than the mutation positions (R = −0.45 as compared to R = −0.71), showing the sensitivity in the EpiScore metric to specific positions, regardless of separation distance.

Asymmetry and Epistasis
In TEM-1, mutational sites which confer incremental changes in biophysical activity are neither locally distributed with respect to one another, nor at important functional sites. Furthermore, they do not belong to an immediately identifiable allosteric inhibitor site, but they do, however, exhibit unique pairwise epistatic behavior which indicates that they likely regulate the active sites allosterically.
In an effort to analyze whether the pairs having EpiScore less than 1 and associated with positive epistasis (a beneficial, cooperative interaction) exhibit long-range communication that is distinct from the pairs having EpiScore greater than 1 and associated with negative epistasis, we explored the degree of asymmetry in long-range communication between the mutational positions and the active site positions using DCI asym . Thus, we calculated DCI asym between each TEM-1 dual-mutant site and all main active sites for the relevant TEM-1 structure (70, 73, 130, 166, 234, Figure 7), excluding the outliers 182-238 and 42-104 from Figure 6B. Here, positive %DCI asym values indicate activesite-dominant dynamic coupling, where mutational sites exhibit higher fluctuation response when the active site is perturbed. On the other hand, negative %DCI asym values indicate mutation-dominant dynamic coupling where perturbations at those positions controls the active site fluctuation response. Interestingly, we observe a relationship that provides some mechanical insight relating the degree of asymmetry to EpiScore; the dual mutants with EpiScore > 1 to active site S70 and epistasis < 1 had more instances of mutational-dominant coupling asymmetry, while the reverse was true for two of the three dual mutants with EpiScore < 1 and epistasis > 1 (position pair 42-104, an outlier in Figure 6B, does not hold to this pattern). This suggests that the epistatic effects captured through EpiScore to active site S70 may be compensated via coupling asymmetry to all active sites, with dynamic modification of the system ultimately including both effects. A position pair that more strongly affects active site S70 via EpiScore also possesses active site-dominant coupling asymmetry and vice versa. Taken together with Figure 6B, these data indicate that dual mutants which confer less disruption to important active sites (indicated by EpiScore < 1) than their averaged individual constituents, and those which are under active site regulation, (indicated by positive %DCI asym ) are those which display the largest degree of positive epistasis.
Thus, as a test system, TEM-1 highlights the complex relationship between mutational positions, allosteric relationships, and epistatic interplay. These emergent properties of the anisotropic residue-residue interaction network within a protein must be accounted for when attempting to fully understand or predict mutation outcomes.

Unidirectional Communication through DCI asym Creates Cause-Effect Relationships in Allosteric Regulations
Using the dynamical picture presented above, the modulation of protein dynamics through mutations (i.e., the fluctuation response to node perturbations within a network) is similar to the modulation of dynamics through binding; this is the fundamental principle behind the concept of dynamic allostery. With the TEM-1 dual mutation positions showing unique coupling asymmetry to the active sites, it follows that there should be an obvious, unidirectional signature between allosteric sites and active sites in known allosteric proteins. Here we explore the role dynamic coupling directionality plays in allosteric regulations using an ideal model system, Pin1. Pin1 is a two-domain protein containing a catalytic PPIase domain and a distally-located WW domain, connected by a flexible (and highly disordered) interdomain linker [54][55][56]. While strictly regulated in both function and expression within healthy biological tissue [57], the up-regulation and down-regulation of Pin1 is associated with several forms of cancer and Alzheimer's disease, respectively [57][58][59][60][61]. Studies have shown that the activity of the PPIase domain is enhanced when a ligand is bound at the non-catalytic WW domain [62,63] and communication between these two domains is requisite for proper biological function [55,[64][65][66]. Previous works propose the existence of communication networks between the WW domain and the PPIase domain, including a unique allosteric pathway which only becomes active when a substrate is bound to the WW domain [62]. A further computational study indicated that pathways of communication via force propagation from the PPIase domain to the WW domain changed when a ligand was WW domain-bound [67].
This suggests a cause-and-effect relationship exists between the two domains. Using this framework, a ligand binding event is modeled as a force perturbation to the binding positions in each domain. Upon these random force perturbations, we find that, overall, the WW domain is able to induce a stronger perturbation response in the PPIase domain than the reverse. This is largely the expected relationship between an allosteric site and a catalytic site; communication between these sites should predominantly involve information transfer from the allosteric site to the catalytic site,  Figure 6B. The first dual mutant (104-182, top left) has arrows drawn to indicate the coupling asymmetry, where red is active site-dominant and blue is mutation site-dominant. Both dual mutants with negative epistasis in turnover rate and EpiScore to position S70 < 1 also had %DCI asym distributions which were, overall, mutation site-dominant and, conversely, those with positive epistasis in turnover rate and EpiScore to position S70 > 1 exhibited active site-dominant %DCI asym .
Previous works propose the existence of communication networks between the WW domain and the PPIase domain, including a unique allosteric pathway which only becomes active when a substrate is bound to the WW domain [62]. A further computational study indicated that pathways of communication via force propagation from the PPIase domain to the WW domain changed when a ligand was WW domain-bound [67].
indicating that %DCIasym can capture communication directionality in allosteric systems from structural dynamics encoded within a given set of atomic coordinates.

Conclusions
In this work we showed how the anisotropic interaction network within a protein captures two essential emergent properties of protein evolution-epistasis and communication directionality-using the information stored in structural dynamics alone. Additionally, EpiScore can capture the behavior of dual-mutation epistatic outcomes with some consistent trends across different protein systems. As seen in pfDHFR, mutation pairs with a lower pairwise dynamic coupling versus average of individual couplings (EpiScore < 1) to FG loop positions are favorable, as dual mutations at these positions may be less likely to disrupt the FG loop's interaction with the functionally critical M20 loop. A similar trend was also observed in the EpiScore analysis of TEM-1 dual mutants, where lower EpiScore to active site S70 was generally associated with higher positive experimental epistasis (R = −0.71) Further, the system-wide EpiScore analysis of GB1 dual mutants has shown that the position pairs with average EpiScore values > 1 were associated more frequently with negative epistasis, indicating that these positions might ultimately be more disruptive to the entire protein when mutated together. Furthermore, when dynamic coupling asymmetry analysis was applied via %DCIasym to TEM-1, we found that EpiScore and epistasis both relate to dynamic coupling asymmetry, where position pairs which exhibited high EpiScores associated with negative epistasis also exhibited mutation-dominant coupling asymmetry. This suggests that %DCIasym and EpiScore may both capture factors which contribute towards the biochemical outcome of dual mutations. If both mutational sites dominate the dynamics coupling with the active site (i.e., the active site responds more to mutational site perturbations), then dual mutations on both sites lead to negative epistasis. This suggests a cause-and-effect relationship exists between the two domains. Using this framework, a ligand binding event is modeled as a force perturbation to the binding positions in each domain. Upon these random force perturbations, we find that, overall, the WW domain is able to induce a stronger perturbation response in the PPIase domain than the reverse. This is largely the expected relationship between an allosteric site and a catalytic site; communication between these sites should predominantly involve information transfer from the allosteric site to the catalytic site, indicating that %DCI asym can capture communication directionality in allosteric systems from structural dynamics encoded within a given set of atomic coordinates.

Conclusions
In this work we showed how the anisotropic interaction network within a protein captures two essential emergent properties of protein evolution-epistasis and communication directionality-using the information stored in structural dynamics alone. Additionally, EpiScore can capture the behavior of dual-mutation epistatic outcomes with some consistent trends across different protein systems. As seen in pfDHFR, mutation pairs with a lower pairwise dynamic coupling versus average of individual couplings (EpiScore < 1) to FG loop positions are favorable, as dual mutations at these positions may be less likely to disrupt the FG loop's interaction with the functionally critical M20 loop. A similar trend was also observed in the EpiScore analysis of TEM-1 dual mutants, where lower EpiScore to active site S70 was generally associated with higher positive experimental epistasis (R = −0.71) Further, the system-wide EpiScore analysis of GB1 dual mutants has shown that the position pairs with average EpiScore values > 1 were associated more frequently with negative epistasis, indicating that these positions might ultimately be more disruptive to the entire protein when mutated together. Furthermore, when dynamic coupling asymmetry analysis was applied via %DCI asym to TEM-1, we found that EpiScore and epistasis both relate to dynamic coupling asymmetry, where position pairs which exhibited high EpiScores associated with negative epistasis also exhibited mutation-dominant coupling asymmetry. This suggests that %DCI asym and EpiScore may both capture factors which contribute towards the biochemical outcome of dual mutations. If both mutational sites dominate the dynamics coupling with the active site (i.e., the active site responds more to mutational site perturbations), then dual mutations on both sites lead to negative epistasis.
As modulation of normal modes and protein dynamics is not only a tool used in evolution but also a principle exploited via allostery, we used an "ideal" allosteric system, Pin1, and observed the dynamic coupling asymmetry between a well-identified allosteric domain and an enzymatically active domain exhibits behavior that, as expected, showed the allosteric WW domain to dominate communication to the PPIase domain. Overall, these two novel protein dynamics-based metrics provide steps to mechanistically describe these complicated interactions, and also shed light on the complex anisotropic interaction network which ultimately gives rise to epistasis and allosteric regulation. They can be useful to predict mutational outcomes, particularly for those sites distal from the active site that can modulate function [68].

Conflicts of Interest:
The authors declare no conflict of interest.