The Glucocorticoid Receptor’s tau1c Activation Domain 35 Years on—Making Order out of Disorder

Abstract: Almost exactly 35 years after starting to work with the human glucocorticoid receptor (hGR), it is interesting for me to re-evaluate the data and results obtained in the 1980s–1990s with the benefit of current knowledge. What was understood then and how can modern perspectives increase that understanding? The hGR’s tau1c activation domain that we delineated was an enigmatic protein domain. It was apparently devoid of secondary and tertiary protein structures but nonetheless maintained gene activation activity in the absence of other hGR domains, not only in human cells but also in yeast, which is evolutionarily very divergent from humans and which does not contain hGR or other nuclear receptors. We now know that the basic machinery of cells is much more conserved across evolution than was previously thought, so the hGR’s tau1c domain was able to utilise transcription machinery components that were conserved between humans and yeast. Further, we can now see that structure–function aspects of the tau1c domain conform to a general mechanistic framework, such as the acidic exposure model, that has been proposed for many activation domains. As for many transcription factor activation domains, it is now clear that tau1c activity requires regions of transient secondary structure. We now know that there is a tendency for positive Darwinian selection to target intrinsically disordered protein domains. It will be interesting to study the distribution and nature of the many single nucleotide variants of the hGR in this respect.


Introduction
Since this issue of Receptors is dedicated to Jan-Åke Gustafsson, it is appropriate to look back to 1988, the year I joined Jan-Åke's group to study structure-function aspects of the human glucocorticoid receptor (hGR). Much of this work, and more, has been comprehensively reviewed elsewhere in this issue [1]. Equipped with current knowledge and understanding, I am reminded that the interpretation of measured empirical data is a changing entity that is at any time limited by the overall state of knowledge at that time. As frontiers of knowledge move forward, new understanding arising from older results becomes possible. It is with this consideration in mind that this perspectives article is written.
My introduction to the hGR came a couple of years after cloning of its cDNA [2]. Recent work on the purified rat glucocorticoid receptor protein had shown that the N-terminus containing the tau1 activation domain was more sensitive to proteolysis than the C-terminal domains and that it was the main recognition site for the monoclonal antibodies that had been developed, thus providing circumstantial evidence for an open, less compact protein conformation for the N-terminal part of the receptor compared to other parts [3]. The C-terminal part of the receptor, containing the DNA-binding domain (DBD) and steroid-binding domain (SBD), was more highly conserved with other members of the nuclear receptor family and would later be the part of the protein for which tertiary structures could be determined [4][5][6][7]. Taken together with the fact that very little was known about the role of transcription factor activation domains in gene activation, these observations contributed to the hGR N-terminal domain being regarded as somewhat enigmatic.

The Human Glucocorticoid Receptor Can Work in Yeast
A second enigmatic aspect of our work was that the hGR functioned in yeast even though yeasts do not contain any nuclear receptors [8][9][10]. How could this be? The answer was not apparent until over a decade later when the genome sequences of yeast [11] and humans [12] could be compared, showing that many of the 30,000-40,000 human proteins manifested remarkable levels of homology to the 5000 or so yeast proteins. As more genomes were sequenced, it became apparent that very many protein families are conserved throughout eukaryotes and that a dominant mechanism of evolution appears to have occurred at the level of gene regulation, such that new functionality has arisen by differentially expressing different combinations of a relatively conserved group of protein designs. These conserved protein families include many components of the transcriptional machinery, and thus, yeast provided a simple model system that allowed us to map functional protein domains, select change-of-function mutants in the hGR, and identify functional partner proteins of the hGR [13][14][15][16].

Acidic and Hydrophobic Amino Acids Contribute to the hGR's tau1c Domain Function
Using our model system, we identified a minimal 58-amino-acid region, the tau1 core (tau1c) [13], which maintained most of the activity of the previously identified larger tau1 activation domain [17]. Like the yeast Gal4p and viral VP16 activation domains, the tau1c domain was rich in acidic amino acids, and nuclear magnetic resonance (NMR) studies showed that, like them, it lacked detectable secondary or tertiary structures at neutral pH [18]. Mutagenesis experiments showed that while most acidic residues were not important individually, activity was progressively lost as larger numbers of acidic residues were neutralised [19]. Thus, tau1c appeared to fulfil the criteria required to satisfy the "acid blob" or "negative noodle" hypothesis that was suggested at the time, whereby activators were thought to function by attracting important transcriptional co-factors through nonspecific electrostatic interactions [20].
However, the characterisation of mutations that were selected for decreased or increased activity suggested a more complex structure-function relationship based on the hydrophobicity of key residues, such that reduced activity mutants had reduced hydrophobicity, and in a few cases, increased hydrophobicity increased activity [15]. These activity changes could be correlated with changes in the binding strength to partner proteins involved in gene activation [21]. The importance of hydrophobic residues had been reported earlier for the VP16 activation domain [22]. Notwithstanding the lack of detectable secondary structures in tau1c, these results could be rationalised in the context of hypothetical amphipathic alpha helices, where key hydrophobic surfaces could be postulated to make inter- or intra-molecular interactions with other hydrophobic surfaces in, for example, partner proteins bound by tau1c [15].
Retrospectively, it is interesting to note a recent genome-wide study of many human transcription factors that appears to establish the generality of the principles that we observed for the tau1c domain [23]. These principles include the importance both of hydrophobic amino acid clusters and acidic residues. An added insight from that systematic study is that the acidic residues are proposed to function by an "acidic exposure" process that presents hydrophobic residues as surface-exposed clusters with the capacity to bind to transcriptional co-activator proteins. These exposed protein conformations include, but are not restricted to, amphipathic helices. A further insight of the study was that leucine residues appear to play a more important "hydrophobic" role than valine and isoleucine, which are often considered to have similar hydrophobicity properties to leucine. Furthermore, it was concluded that some scales of estimated residue hydrophobicity (e.g., the Wimley-White scale [24]) may provide better estimates in this context than others (e.g., the commonly used Kyte-Doolittle scale [25]). Consistently, the effect of mutations on the activity of tau1c is less well correlated with their effect on hydrophobicity when hydrophobicity is estimated by the Kyte-Doolittle method (Rho = 0.39) than when it is estimated by a range of other methods, including the Wimley-White method (Rho = 0.59, Figure 1). Residue classes where the effects of mutations on activity and hydrophobicity are not as well correlated include acidic residues and leucine residues (Figure 1, red and cyan points). Substitutions of most acidic residues reduce activity irrespective of their effect on hydrophobicity, consistent with the suggested importance of their acidity in the proposed acidic exposure process required to present hydrophobic surfaces suitable for partner protein interactions. In contrast, the D196Y and E221F substitutions cause a substantial increase in both activity (to 281% and 288%, respectively) and hydrophobicity (by 2.7 and 3.9 units, respectively, using the Wimley-White scale), perhaps because these acidic residues are not required to mediate the acidic exposure process but would increase the extent of the presented hydrophobic surface in their mutated form (see [15]).
Figure 1. Extent of correlation between the effect of tau1c substitution mutations on gene activation activity and their effect on hydrophobicity using different scales to estimate the hydrophobicity of amino acid residues. Scatter plots showing the effect of 45 single residue substitution mutants on hydrophobicity (wild-type value subtracted from the mutant residue value) as a function of their effect on activity (mutant activity as a percentage of wild type). Axes: effect of mutations on activity (Log2 % wild type) and effect of mutations on hydrophobicity (mutant−wt). The logarithmic scale is used to improve data point separation. Points are coloured according to the identity of the substituted residue in the wild-type tau1c protein. Dotted lines divide the plot into quadrants based on X and Y values for the wild-type tau1c domain. Rho correlation coefficients were calculated using Spearman's rank correlation tests (two-sided), and p-values testing the null hypothesis that Rho = 0 are shown. Amino acid hydrophobicity scores for the different methods were taken from Table 2 in [26]. Values for the effect of mutations on tau1c activity were taken from the previously published Supplementary Table 1 in [27].
Contrary to the proposed special status of leucine residues in relation to isoleucine and valine in activation domains [23], many substitutions of leucine and isoleucine (cyan and green in Figure 1) seem to show a similarly high level of correlation between their effects on activity and hydrophobicity. However, a subset of leucine substitutions shows disproportionately large activity reductions in relation to the corresponding predicted reductions in hydrophobicity. These substitutions include L236V and L194V, for which activity is reduced to 17% and 23%, respectively, but where the hydrophobicity reduction (Wimley-White scale) is only 0.8 units. Interestingly, this subset is restricted to substitutions of only two of the four analysed leucine residues (L194 and L236) in tau1c, suggesting that the special significance that has recently been attributed to leucine residues in activation domains may be restricted to specific molecular contexts.
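The rank-correlation analysis described for Figure 1 can be illustrated with a minimal sketch. It assumes only the four substitutions whose values are quoted in the text (D196Y, E221F, L236V and L194V, Wimley-White scale); the function names are hypothetical, and the resulting Rho is for these four points only, so it should not be expected to reproduce the Rho of 0.59 reported for the full set of 45 mutants.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's Rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# (activity as % of wild type, hydrophobicity change mutant - wt), as quoted in the text.
mutants = {
    "D196Y": (281.0, 2.7),
    "E221F": (288.0, 3.9),
    "L236V": (17.0, -0.8),
    "L194V": (23.0, -0.8),
}

# The log2 transform only spreads the points; it does not change their ranks.
log2_activity = [math.log2(activity) for activity, _ in mutants.values()]
delta_hydrophobicity = [delta for _, delta in mutants.values()]

rho = spearman_rho(log2_activity, delta_hydrophobicity)
print(f"Spearman Rho for the four example substitutions: {rho:.2f}")
```

Because Spearman's Rho depends only on ranks, the choice of hydrophobicity scale matters whenever two scales order the substitutions differently, which is one way to read the differences between the panels of Figure 1.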

Proteins without Apparent Structure Can Mediate Biological Function
In 1988, some of the most insightful studies on activation domains had been performed on the yeast Gal4p and viral VP16 activation domains. These were shown to be generally small modular domains that had been referred to as "acid blobs" or "negative noodles" as a result of their predicted lack of ordered structure and the preponderance of acidic amino acid residues in their primary sequences [20]. Biophysical studies of recombinant Gal4p and VP16 activation domains confirmed the lack of ordered structure at neutral pH, although the Gal4p domain did adopt a β-sheet conformation under mildly acidic conditions [28,29]. However, the conventional view at the time was that protein functionality was coupled to tertiary structure formation and thus the idea that an unstructured protein could mediate biological function was not broadly accepted. It would take the bioinformatic discovery, some 15 years later, of many genomic regions potentially encoding low complexity proteins for the concept to gain traction that protein functionality could be mediated by so-called intrinsically disordered regions (IDRs) of proteins [30]. Understanding IDRs has since contributed to the understanding of inter-protein interactions, membrane-less intra-cellular compartmentalisation involving phase transitions, and many other aspects of molecular biology [31,32].
As intimated above, our original studies showed no evidence of a structured conformation for the isolated tau1c domain in aqueous solution, but alpha-helical conformation was detected in the presence of added trifluoroethanol (TFE), indicating a propensity of tau1c to form alpha helices under some conditions [18]. It is, however, unclear whether TFE conditions in vitro are relevant for physiological contexts in vivo. Importantly, similar effects were observed under more physiological in vitro conditions for larger regions of the hGR containing the tau1c domain [33,34]. Therefore, in light of the current knowledge about IDRs, we recently revisited the issue of structured conformation in the tau1c domain [35]. Analysis of chemical shifts in newly collected NMR data for the tau1c domain in aqueous buffer showed clear evidence of transient alpha-helical conformation in three segments that roughly corresponded to regions identified previously in the presence of TFE [18]. The transient alpha-helical regions have secondary structure propensity (SSP) values around 0.2-0.3, suggesting a dynamic ensemble of conformational forms with 20-30% helical conformation. The existence of transient secondary structure elements has been shown to be a general feature in many of the transcription factor activation domains that have been studied by NMR and which are otherwise considered to be intrinsically disordered [36].
To evaluate the possible relationship between the level of intrinsic protein disorder and tau1c activity, we recently re-evaluated the set of 60 change-of-function mutations affecting tau1c amino acids from our early work [15], asking whether their predicted effect on intrinsic disorder correlates with their previously measured effect on gene activation activity [27]. Several software tools were used to predict the effects of the mutations on intrinsic disorder, peptide-backbone stiffness and protein interaction propensity. Table 1 shows that there is a significant moderate negative correlation between the effects of mutations on tau1c activity and their effects on predicted disorder. In line with this, the effects of mutations on predicted stiffness of the tau1c peptide backbone and protein interaction propensity are positively correlated with their effect on activity. Consistently, molecular dynamics simulations predicted correlation between the effect of many mutations on the stability of a transient alpha-helical region and gene activation activity [27]. Thus, for the tau1c domain and perhaps most other activation domains, there is a correlation between ordered conformation and the level of their activity, perhaps due to the role of transient secondary structure elements in mediating interactions with partner proteins.
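The reading of SSP values as approximate fractional helix populations can be sketched as follows. This is an illustration only: the per-residue profile, the threshold and the minimum segment length are hypothetical values chosen for the example and are not the tau1c data from [35].

```python
def helical_segments(ssp, threshold=0.15, min_length=3):
    """Return (start, end, mean_ssp) for runs of residues with SSP >= threshold.

    Indexes are 0-based and inclusive. Positive SSP values are read here as
    the approximate fraction of the ensemble that is helical at that residue.
    """
    segments = []
    start = None
    # A sentinel below any threshold closes a run that reaches the last residue.
    for i, score in enumerate(list(ssp) + [float("-inf")]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_length:
                run = ssp[start:i]
                segments.append((start, i - 1, sum(run) / len(run)))
            start = None
    return segments

# Hypothetical SSP profile for a short disordered peptide (not measured data).
example_ssp = [0.02, 0.05, 0.22, 0.28, 0.30, 0.25, 0.04, -0.03, 0.21, 0.24, 0.20, 0.01]

for start, end, mean_ssp in helical_segments(example_ssp):
    print(f"residues {start}-{end}: mean SSP {mean_ssp:.2f} "
          f"(about {100 * mean_ssp:.0f}% helical in the ensemble)")
```

A mean SSP of 0.2-0.3 over a segment therefore corresponds to the 20-30% helical population mentioned above, which is far from a stably folded helix but clearly distinguishable from a fully random coil.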
Table 1. Correlation between the effect of tau1c change-of-activity mutations on gene activation activity and their effect on predicted intrinsic disorder, peptide backbone rigidity and protein interaction propensity. Table footnotes: derived from Spearman's rank correlation analysis, two-sided; null hypothesis, Rho = 0.

Coupled Binding and Folding of Activation Domains
It has become apparent that a major context for the formation of an ordered conformation in activation domains is their interaction with partner proteins, a process referred to as coupled binding and folding [46]. We provided early evidence for this by showing the formation of helical conformation upon binding of the intrinsically disordered cMyc activation domain to the TATA-binding protein (TBP) [47]. As for the tau1c domain, the cMyc activation domain has been shown to contain transient secondary structure elements and these are stabilised upon the binding of the domain to partner proteins [48].
To understand the mechanisms involved in the coupled binding and folding of the cMyc activation domain to TBP, we performed kinetic experiments [49]. The interaction takes place in two distinct temporal steps. The first step is rapid and sensitive to salt concentration, indicating the importance of ionic interactions. The second step is slower with the rate being correlated with temperature, thus suggesting an enthalpy-driven step involving protein folding. We later showed similar results using the VP16 and Gal4 activation domains binding to TBP, as well as the Swi1 and Snf5 subunits of the Swi/Snf chromatin remodelling complex [50]. We were never able to complete analogous studies with the tau1 domain due to technical difficulties, but others have shown changes in protein conformation upon binding of partner proteins by the hGR N-terminal domain [51,52], suggesting that similar binding mechanisms may apply. From our kinetic studies, we surmised that acidic residues in activation domains likely play an important role in the first interaction step. It is of course possible that they also play an important role in the recently proposed process of acidic exposure [23], which presents clusters of hydrophobic residues in a conformation favouring partner protein interaction during the second step. The acidic exposure process would thus be expected to favour transition from the first binding step to the second, enthalpy-driven step. Kinetic studies have subsequently been conducted for a variety of protein interactions, where different transactivation domains adopt "induced fit" and "preformed conformation capture" interaction strategies to different extents [53,54]. For the disordered N-terminus of the hGR, containing the tau1c domain, there is also evidence for allosteric regulation involving intra-molecular coupled binding and folding interactions [55].
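The two-step character of the interaction can be visualised with a deliberately simplified kinetic sketch. It assumes an irreversible sequential scheme (unbound, then a rapidly formed encounter complex, then a slowly formed folded complex) under pseudo-first-order conditions; the rate constants are hypothetical and are not the values measured in [49,50].

```python
import math

def two_step_populations(t, k_fast, k_slow):
    """Fractions of (unbound, encounter complex, folded complex) at time t.

    Assumes the irreversible scheme unbound -> encounter -> folded with
    first-order rate constants k_fast and k_slow (same time units as t).
    """
    if k_fast == k_slow:
        raise ValueError("this closed-form solution requires k_fast != k_slow")
    unbound = math.exp(-k_fast * t)
    encounter = k_fast / (k_slow - k_fast) * (
        math.exp(-k_fast * t) - math.exp(-k_slow * t)
    )
    folded = 1.0 - unbound - encounter
    return unbound, encounter, folded

# Hypothetical rate constants (per second): a fast electrostatic step
# followed by a folding step that is one hundred times slower.
K_FAST, K_SLOW = 50.0, 0.5

for t in (0.0, 0.1, 1.0, 5.0, 20.0):
    unbound, encounter, folded = two_step_populations(t, K_FAST, K_SLOW)
    print(f"t = {t:5.1f} s  unbound {unbound:.3f}  "
          f"encounter {encounter:.3f}  folded {folded:.3f}")
```

With well-separated rate constants the encounter complex accumulates almost completely before the folded complex appears, which is the kind of behaviour that allows the two steps to be resolved experimentally. In this picture, salt sensitivity would act mainly on the first rate constant and temperature mainly on the second.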

Intrinsically Disordered Regions and Evolution
Our early work showed that missense mutations substituting amino acids in the tau1c region of the hGR could either increase or decrease the strength of the response to glucocorticoid signalling [15]. Thus, the intrinsically disordered tau1c domain would be expected to represent a facile target for adaptive mutations that modulate the strength of glucocorticoid response during evolution. We considered that perhaps the conformational flexibility of IDRs could also impart an intrinsic propensity for evolutionary adaptation because their spatial and dynamic flexibility could facilitate the ability to adopt new conformational forms, leading to new functionality. Interestingly, we later showed that yeast transcription factors, which contain a high content of IDRs, appear to have been preferred targets for adaptation processes during evolution [56]. Furthermore, in a genome-wide study of fully sequenced genomes from 64 wild and domesticated yeast strains (Saccharomyces cerevisiae or the very closely related Saccharomyces paradoxus), we showed that codon sites predicted to be under positive Darwinian selection were about three times more likely to occur in regions encoding predicted IDRs than in regions encoding predicted secondary structure elements of proteins [57]. Similar findings have been reported in a broad range of other organism groups, including humans, as exemplified in [58][59][60][61]. Thus, the intrinsic evolvability of IDRs in transcription factors may provide part of the explanation for why evolution throughout eukaryotes has in part been driven by increased regulatory complexity rather than, e.g., the advent of new protein classes. A large number of single nucleotide variants have been identified for the hGR, with many occurring in the disordered N-terminal domain containing the tau1c activation domain, but the selective/functional significance of these variants remains to be systematically evaluated [62].
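A statement such as "about three times more likely" corresponds to a ratio of per-codon rates in the two region classes. The sketch below shows that calculation; the function name is hypothetical and the codon counts are invented purely for illustration, not taken from [57].

```python
def selection_enrichment(selected_idr, total_idr, selected_structured, total_structured):
    """Ratio of the per-codon rate of positively selected sites in IDRs
    to the corresponding rate in structured regions."""
    if total_idr <= 0 or total_structured <= 0 or selected_structured <= 0:
        raise ValueError("totals and the structured-region count must be positive")
    # Equivalent to (selected_idr / total_idr) / (selected_structured / total_structured),
    # written as a single division to limit rounding error.
    return (selected_idr * total_structured) / (total_idr * selected_structured)

# Hypothetical counts (not the published data): 90 selected sites among
# 30,000 IDR codons versus 40 selected sites among 40,000 structured codons.
ratio = selection_enrichment(90, 30_000, 40, 40_000)
print(f"Selected sites are {ratio:.1f} times more frequent in IDR codons")
```

Normalising by the number of codons in each class matters because IDRs and structured elements occupy different fractions of the proteome, so raw counts of selected sites alone would not be comparable.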

Conclusions and Future Perspectives
The main technical developments that are relevant for the re-evaluation of our early work on the tau1c activation domain of the hGR are large-scale sequencing of DNA allowing for the sequencing of whole genomes, extended and refined spectroscopic methods for studying protein conformation, and the development of software for the analysis of genomes and proteomes, as well as protein sequences and conformations. As described in this perspectives article, these developments have allowed for (i) the rationalisation of how human proteins are able to function in yeast, (ii) the understanding of structure-function aspects of tau1c within a generalised framework based on systematic studies of many transcription factors and (iii) the potential for understanding hGR variants within the scope of a better general understanding of the relationship between functional adaptation/conservation and protein conformation characteristics.
The future will likely see the continued development of experimental and computer-based methods that will offer further understanding of IDRs, like the tau1c domain, in the context of the ensembles of conformational states that together define them. Such approaches will likely further the understanding of aspects such as the role of transient regions of secondary structure within IDRs, as well as the ability of individual IDRs to transition between different liquid or solid phases. Correlating the effects of natural or experimentally induced mutations on such biophysical aspects with their effects on biological function will continue to provide an important approach to link biophysical phenomena to protein functionality.