Protein molecules are nature’s architectural marvels. They are crafted to perform a defined function, which gives them their biological relevance [1]. In this study, we unraveled the evolutionary fingerprints in the structure of superoxide dismutase (SOD) with the aim of exploring how residues that are significant from an evolutionary perspective influence the structural integrity of the protein. Superoxide dismutase is of immense importance from an evolutionary standpoint, since it originated at a significant geo-biological transition, the Great Oxidation Era. This geological era witnessed the rise and dominance of cyanobacterial life forms, which, in turn, were responsible for enriching the Earth’s atmosphere with molecular oxygen. The inundation of Earth’s atmosphere with molecular oxygen was coupled with the rise of oxygen-based metabolism and, hence, oxygen-based life forms [2]. Molecular oxygen is extremely prone to erroneous reduction, resulting in the generation of superoxide free radicals. These free radicals are reactive and highly toxic to the cellular machinery and membrane integrity [3]. Cells had to evolve ways to counteract the toxic impact of oxygen, which led to the evolution of a highly orchestrated network of antioxidants in a defense system, where SOD is the primary antioxidant [2]. Therefore, the evolution of oxygen-based life is integrally coupled with the co-evolution of an antioxidant defense system and, more importantly, the evolution of SOD [2]. The rise of atmospheric oxygen was a critical event in biological evolution [5]. Some crucial evolutionary events, such as the birth of eukaryotes and the explosion of animal diversity in the Cambrian period, have been linked to elevated atmospheric oxygen [6]. Currently, solutions to questions related to planet oxygenation depend largely on geochemical methodologies [7].
Superoxide dismutase exists in various canonical isoforms [5]. The isoforms differ in their metal ion co-factors, although their other functions remain identical. Human cells are armed with two canonical isoforms of SOD [5]. The mitochondrial SOD has Mn as the co-factor, and its rise and evolution can be traced back to bacterial endosymbiosis [8]. Cytosolic SOD, in contrast, has Cu and Zn as the metal ion co-factors and is hence termed Cu–Zn SOD or SOD1 [8]. This metalloenzyme exists as a homodimer, where each monomer is composed of a β-barrel and seven loops. Of these loops, loops IV and VII are extremely significant from structural and functional perspectives. Six of the seven metal co-ordination sites are positioned in these two loops. A redox-active Cu ion is attached to the protein through its interaction with four histidine residues: His 46, His 48, His 63, and His 120. A Zn co-ordination pocket is formed by the interaction between the Zn ion and residues His 63, His 71, His 80, and Asp 83 [11]. Of these seven metal-binding residues, His 120 is located in loop VII, His 48 is in the β-strand, and the remaining five residues are all positioned in loop IV. Importantly, these loop regions have been directly implicated in the misfolding and subsequent cytotoxic aggregation of SOD1, which lead to fatal neurodegeneration and a diseased state termed amyotrophic lateral sclerosis (ALS) [11]. More than 100 different mutations have been associated with ALS [12]. Furthermore, the sequence of events comprising zinc acquisition, chaperone-mediated copper loading, and functional activation through dimerization of SOD is integrated with the orchestration of the loop regions [14]. Therefore, a comprehensive study of these two loop regions would offer better insights into the internal dynamics of this protein and the relevance of these loops to SOD1 physiology and function.
Several recent studies have sought to identify the residue patch of SOD1 responsible for ALS. Probable aggregate structures of the disease mutants and their impact have also been studied in cells [17]. However, information is lacking on how evolutionarily important residues affect this disease mechanism. In this study, we unraveled the evolutionary fingerprints in the structure of SOD with the aim of exploring how the regions that are significant from an evolutionary perspective influence the structural integrity of the protein. The evolutionary analysis examined the co-variation of amino acid residues in terms of mutual information (MI) and direct information (DI) by deploying direct coupling analysis (DCA) on the SOD1 sequence space, comprising almost 4000 sequences from the SOD_Cu (PF00080) family. Our evolutionary coupling analysis identified the local regions of SOD1 where the bulk of the coupled pairs are lodged, reflecting the importance of discrete sub-structural areas in SOD1 that are under strong evolutionary selection.
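To make the MI measure concrete, the following is a minimal sketch of how mutual information between two alignment columns can be computed from observed residue frequencies. It is not the pipeline used in this study: the function name `column_mutual_information` and the four-sequence toy alignment are hypothetical, and no gap handling, sequence reweighting, or pseudocounts (which are common in practice) are applied. DI additionally requires fitting a global DCA model and is not shown.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences. Frequencies are plain
    counts; reweighting and pseudocounts are deliberately left out.
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment (not SOD1 data): columns 0 and 1 always change
# together, whereas column 2 varies independently of column 0.
toy_msa = ["HAD", "HAE", "RGD", "RGE"]

mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_independent = column_mutual_information(toy_msa, 0, 2)
```

In this toy case the perfectly co-varying pair carries one bit of mutual information and the independent pair carries none, which is the contrast that MI-based ranking of residue pairs relies on.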
As evidenced by the PONDR®-based disorder analysis, the loop regions (loops IV and VII) of the metal-depleted apo-SOD1 had two stretches of extended disorder, intrinsically disordered domains I and II (IDD I: residues 49–82 and IDD II: residues 121–142) [19]. In contrast, the metallated wild-type SOD1 (WT SOD1) is very robust and possesses high thermal and chemical stability [13]. Therefore, the metal co-ordination of SOD1 is integral to this protein’s transition from the low-stability disordered state (which is crucial for interaction with metal ions) to the very stable ordered state required for its subsequent biological functions. We resorted to structure–network analysis to understand how the inner organization of the pair-wise linked amino acids changes, impacting the internal dynamics of SOD1 as this protein transitions from its apo-state to the metallated form. For this purpose, we used a number of protein variants which have either a disrupted metal binding site or a mutated loop stretch. Stretch mutants were generated by means of mutations that were reported to be involved in ALS (retrieved from UniProt). These mutations were reported to promote aggregate formation by distinct mechanisms. Some missense mutations were reported to distort Zn binding [20] and some to decrease the metal ion co-ordination affinities, leading to the formation of aggregates [13]. Mutations at some of the selected residues were reported to reduce the net charge of the protein molecule at pH 7.4 [13]. We also used the fully metallated WT protein and the completely metal-free apo-protein as two extreme controls. The variants selected for our study are important because the metal micro-environment and the loop flexibility are intrinsically coupled.
Although there are multiple reports reflecting upon the importance of the electrostatic loops [21] in SOD1 and the role of metal ions in terms of crafting the structural integrity in SOD1, there is a lack of understanding as to how these residues are interdependent and whether they are under evolutionary selection pressure. Deploying co-evolution analysis on a wide array of sequences, we have been able to retrieve residues which are co-evolving and, hence, evolutionarily coupled.
Our collective inferences drawn from sequence space and structural analysis suggest the critical importance of the loop regions, along with residue-specific contributions, in deciding the global conformational fate of SOD1 and the concomitant transition from disorder to order upon sequential metal co-ordination.
From the MI and DI scores, probable sets of interacting partners (residues) were defined through the networks, where each node represented an amino acid residue and the edges revealed the interactions, i.e., the coupling among the residues. By exploring the top DI pairs, it was observed that most of the co-varying residues participating in coupling interactions were positioned mainly in loops IV and VII. Resorting to graph-theoretical models, we could infer the importance of the loop regions in SOD. From the graph-theoretical model, it becomes apparent that, although the overall structural integrity of SOD1 is predominantly determined by its β-sheet structure, the unstructured loop segments also make a key contribution. Interestingly, many of these coupled co-varying residues have already been linked with SOD1 aggregation and ALS from the structural and physiological perspectives. A mutation at position 8 has been reported to reduce enzymatic activity and has been isolated from ALS patients [43]. There are reports of an H46R mutation in the Cu/Zn SOD gene which has been strongly associated with a unique subtype of familial ALS [44] and with non-native conformational changes leading to a gain of interaction among dimers that further propagate into higher-order arrays [45]. Further, there are reports that a mutation at position 58 heavily impacts Cu loading, owing to impaired chaperone interaction [46], and promotes fibrillar aggregate formation [47]. It is interesting to note that residues around position 58, also identified in our co-evolution analysis, have been reported to be extremely critical in deciding the dimerization propensity, as the stretch involves a residue associated with the intra-subunit disulfide bond and an increased loop flexibility [48]. A mutation at residue 68 has also been reported in clinical cases of ALS [49]. Mutations at positions 118 and 125 have also been reported as novel exonic mutations in clinical cases of ALS [50].
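The construction of a coupling network from ranked DI pairs, and the check of where its nodes fall, can be sketched as follows. This is an illustrative outline under stated assumptions rather than the analysis performed here: the helper names are hypothetical, the loop boundaries simply reuse the disordered stretches quoted in the text (49–82 and 121–142), and the DI scores and residue pairings are invented for the example.

```python
from collections import defaultdict

# Assumption: loop IV and loop VII are approximated by the disordered
# stretches quoted in the text (IDD I: 49-82, IDD II: 121-142).
LOOP_REGIONS = {"loop IV": range(49, 83), "loop VII": range(121, 143)}


def region_of(residue):
    """Return the loop name a residue position falls in, or 'other'."""
    for name, span in LOOP_REGIONS.items():
        if residue in span:
            return name
    return "other"


def top_pairs_network(di_scores, top_n):
    """Keep the top_n highest-scoring pairs and return an adjacency map.

    Nodes are residue positions; an edge marks a retained coupled pair.
    """
    ranked = sorted(di_scores.items(), key=lambda item: item[1], reverse=True)
    graph = defaultdict(set)
    for (i, j), _score in ranked[:top_n]:
        graph[i].add(j)
        graph[j].add(i)
    return dict(graph)


def loop_fraction(graph):
    """Fraction of network nodes that lie in either loop region."""
    in_loops = [r for r in graph if region_of(r) != "other"]
    return len(in_loops) / len(graph)


# Hypothetical DI scores: pairings and values are made up for illustration.
toy_di = {
    (58, 68): 0.31,
    (125, 131): 0.27,
    (46, 118): 0.22,
    (8, 20): 0.05,
    (30, 100): 0.02,
}

network = top_pairs_network(toy_di, top_n=3)
fraction_in_loops = loop_fraction(network)
```

With real DI output, the same bookkeeping would show whether the retained top pairs concentrate in loops IV and VII; the cut-off `top_n` is a free parameter whose value is not specified here.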
We performed structure network analysis to understand how global organization in SOD1 is dictated by loops IV and VII, which, in turn, house the majority of these co-evolving residues. The apo form of SOD1, lacking the metal ion co-factors, represents a completely opposite state compared to WT SOD1. Since SOD1 is a complex system with heterogeneous secondary structural organization and co-factor ions orchestrating the structural fine-tuning, pathogenic forms of SOD1 show wide disparities in protein stability. However, most ALS-associated mutations have been reported to have the greatest impact on the immature form of SOD1 with destabilized metal-free states [51]. Betweenness centrality profiles in our study revealed that the metal pockets (i.e., Cu and Zn co-ordination sites) exhibit structural rigidity in the presence of the metal ion co-factors. For the other mutants, centrality values for this same stretch were found to be higher than for the WT (Figure 6). Interestingly, since all the other variants had disrupted metal co-ordination, they showed nearly equal centrality values for the aforementioned stretch. The difference in centrality values between the WT protein and the other variants strongly indicates the importance, as well as the influence, of those sites on the intrinsic dynamics of the protein. The apo and Zn-SOD1 forms shared an almost similar range of centrality throughout intrinsically disordered domains I and II (IDD I and IDD II). We provide a clear picture of how the internal dynamics of SOD1 gradually change upon metal co-ordination. This can be directly correlated with the previously reported structural stability of SOD1. Earlier reports have stated that Cu and Zn co-ordination are extremely important for structural integrity and for preventing aggregation by, respectively, stabilizing the intra-subunit disulfide linkage and promoting folding of the two disordered loops, thereby creating the catalytic subunit and concomitantly stabilizing the global structure [23].
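For readers unfamiliar with the measure, betweenness centrality of a residue counts the shortest paths between other residue pairs that pass through it. The sketch below implements the standard Brandes algorithm for an unweighted, undirected graph in plain Python; it is an illustration, not the software used in this study. The two five-node graphs are hypothetical and do not represent SOD1 contacts; they only show that adding a single extra contact can change a centrality profile.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `adjacency` maps each node to an iterable of its neighbours
    (Brandes' algorithm; every edge must be listed in both directions).
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        n_paths[source] = 1
        dist = {v: -1 for v in adjacency}
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inwards.
        dependency = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical toy graphs: a linear chain of five nodes, and the same chain
# with one extra contact joining its two ends.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ring = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}

chain_profile = betweenness_centrality(chain)
ring_profile = betweenness_centrality(ring)
```

In the chain, the middle node lies on four shortest paths, whereas in the ring every node lies on exactly one. Comparing per-residue profiles between variants, as done for Figure 6, follows the same idea on graphs derived from the protein structures.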
The change in the number of community clusters observed in our structure network analysis, from the de-metallated to the fully metallated state through the intermediate partially metallated states and other variants, revealed that the metal co-ordination sites and their micro-environments are tightly constrained. The increase in the number of clusters in SOD1 under completely metallated conditions indicates smaller local arrangements resulting from the Cu and Zn micro-environments (Figure 5A). Here, the fluctuation of the residues near the metal co-ordination sites almost vanishes and contributes insignificantly to the internal dynamics, as evidenced by a negligible betweenness centrality value (Figure 6A). The absence of metal ions disrupts these smaller local arrangements, with a concomitant impact on the internal dynamics emanating from the residues making up the loop regions of SOD1, which otherwise craft the metal micro-environments. Thus, the number of clusters decreases to eight in the case of apo SOD1 (Figure 5D). Considering all these facts, the residues in clusters 3, 5, 6, and 7 in the community cluster network of the WT protein were considered highly significant (Figure 7).
In the absence of metal ions, either in the partial mono-metallated state or the complete de-metallated state, the extended loops, owing to their intrinsically disordered nature, support a continuum of conformational states and transitions. The binding of metal ion co-factors to an intrinsically unstructured protein is accompanied by a disorder-to-order transition that incurs an entropic cost [52]. Here, thermodynamic stability is guided by a favorable enthalpy contribution, which represents the enthalpy–entropy compensation [52]. These are all internal events and remain synchronized with the metal co-ordination in SOD1. This renders a cryptic disorder in proteins like SOD1, where the metal ion co-factors, upon entry, conceal the local disorder and lock the loop region in a state of restricted mobility. Our evolutionary analysis pinpoints specific residues which are co-evolving and are hence extremely critical for SOD1’s biological relevance. Interestingly, mutations at many of these positions have already been associated with ALS. Further, our analysis reveals some novel sites which have not previously been associated with ALS and yet are critical and, hence, co-evolving.