Illuminating Intrinsically Disordered Proteins with Integrative Structural Biology

Intense study of intrinsically disordered proteins (IDPs) did not begin in earnest until the late 1990s when a few groups, working independently, convinced the community that these ‘weird’ proteins could have important functions. Over the past two decades, it has become clear that IDPs play critical roles in a multitude of biological phenomena with prominent examples including coordination in signaling hubs, enabling gene regulation, and regulating ion channels, just to name a few. One contributing factor that delayed appreciation of IDP functional significance is the experimental difficulty in characterizing their dynamic conformations. The combined application of multiple methods, termed integrative structural biology, has emerged as an essential approach to understanding IDP phenomena. Here, we review some of the recent applications of the integrative structural biology philosophy to study IDPs.


Introduction
Intrinsically disordered proteins (IDPs) are presently a prime focus of the protein biochemistry research enterprise, but that was not always the case. Although IDPs represent around a third of all proteins in eukaryotes [1][2][3][4], they were not a fashionable topic for researchers until the late 1990s. The existence of natively unfolded or disordered segments within otherwise folded proteins was well known from the early days of X-ray crystallography where parts of proteins that were not part of structure solutions were presumed dynamic and flexible. Those regions were often excised to aid crystallization. A few examples of focused IDP study appear through the literature in the 1960s and 1970s [5]. The discrepancy of some protein mobilities in size-exclusion chromatography compared to well-folded standards was an early observation interpreted as being due to flexible, disordered proteins [6]. In the 1970s, NMR studies could reveal disordered conformations, for example in glucagon [7]. From the 1960s to the 1980s, components of ribosomes [5] and histones [8] were also considered to have flexibility or disorder.
Despite these studies discussing properties of IDPs, the idea that biological functions could derive directly from the disordered properties was generally not considered. Gradually, appreciation for functional impacts of flexible linkers between domains or disorder-to-order transitions accrued [9]. One notable example of ahead-of-its-time thinking was Paul Sigler's 1988 musings, in which he synthesized several results about transcription factors [10] and proposed a key functional role for their disordered domains. Perhaps the proposal was not more widely adopted in part because he named the functional domain an 'acid blob' or 'negative noodle', alluding to the role of the overall charge of the disordered domain in the proposed function. The failure of the broader field to seriously consider that functions could directly result from the nature of the disordered chain in IDPs has been suggested by several authors [11][12][13] to be a result of the then-dominant lock-and-key view of enzyme/substrate functional interactions, reinforced by the stunning successes of X-ray crystallography in providing snapshots of the stable 'active sites' of carefully folded proteins. This bias, combined with the experimental difficulties in characterizing these disordered conformational ensembles, prevented earlier appreciation of the functional roles we now know IDPs have. Indeed, even today, IDPs are thought to comprise a significant fraction of the 'dark' proteome, the part of the proteome that is genetically expected to exist but has not yet been observed and characterized [14].
In the 1990s, the proliferation of gene-based techniques to interrogate protein function, genomic sequencing, and bioinformatics advances, along with technical improvements in NMR, led to increased appreciation of the functional importance of IDPs. Uversky was among the first scientists to discuss significant, widespread functions for IDPs in ways that were highly influential and brought recognition of this potential to the wider field [15][16][17][18].
Some credit for establishing clear 'structure-function' paradigms arising from the disordered properties of IDPs must go to the practice of combining several distinct characterization approaches to draw conclusions, an approach termed integrative structural biology [34][35][36]. Integrative structural biology seeks to combine multiple characterization approaches with different sensitivities to provide a more complete understanding of biomolecular conformational ensembles and dynamics. The tendency of IDPs to fluctuate rapidly while sampling wide ranges of conformation space, rather than remaining in a single state, makes them well suited for applications of multiple experimental probes that reveal different aspects of their behaviors. Such integrative structural biology approaches are becoming more common. Impressively, Uversky anticipated the utility of multiple characterization methods to enhance the understanding of IDP functions. He amusingly illustrated the necessity of using integrative structural biology approaches for IDP studies with a parable about the confusion that arises when examining an elephant without the proper global perspective [37]. Here, we review some of the latest successes in combining methods through the integrative structural biology approach to characterize IDP conformations and address their myriad functions.

Summary of Methods
From a general experimental perspective, confirming that conclusions are consistent with multiple different experimental methods inspires increased confidence. For instance, some methods require modification of molecules with extrinsic labels (fluorescence or EPR, for example). Consistency with other measurement methods that do not use the modifications or use different modifications can confirm that such modifications do not affect the results in detrimental ways. In the integrative structural biology approach applied to IDPs, using different methods also has greater benefits because different methods have sensitivities to distinct length or time scales and even different concentration ranges (Figure 1). IDPs have behaviors that span broad ranges in these properties, making the use of multiple methods almost essential. For example, liquid-liquid phase separation (LLPS) phenomena occur for a number of IDPs where concentrations are a key controlling factor [38]. Before discussing applications of integrative structural biology approaches to IDPs (see Figure 1), we first briefly describe some of the key individual methods used to characterize IDPs.

Nuclear Magnetic Resonance
Nuclear magnetic resonance (NMR) has been used to study ordered proteins with atomic resolution since the 1950s [39,40] and applied to IDPs for several decades [37,41]. NMR relies on the local environment of each nucleus to produce a unique chemical shift signal which provides information on the conformation and close surroundings [37,42,43]. IDPs do not have a stable local environment, so NMR alone lacks the ability to fully characterize disordered regions. However, NMR methods such as paramagnetic relaxation enhancement (PRE), secondary chemical shift (SCS), residual dipolar couplings (RDCs), and others can be used to characterize the conformational dynamics of IDP structures. Although NMR is a powerful technique, it has limitations: long IDPs must be divided into shorter segments; experiments are often conducted at low temperatures, which can slow some kinetic processes; high sample concentrations are generally needed; and tags should be removed before conducting experiments [44]. NMR provides ensemble averages but is limited in its ability to determine the full conformational distribution of an IDP. A related method, electron paramagnetic resonance (EPR), requires attachment of extrinsic spin labels and can be used at low temperatures to probe individual states and collect information on distance distributions [43,45].

Scattering Methods
Small angle X-ray scattering (SAXS) and small angle neutron scattering (SANS) are other methods commonly used to study IDPs which provide information on a global scale compared to the local scale NMR offers. A SAXS or SANS scattering profile can differentiate between globular and disordered proteins and determine a protein's size and overall shape [46,47], although these interpretations are low-resolution, require high protein concentrations, and are dependent on model selection [48,49]. These methods are commonly used to characterize IDPs [47]. An IDP will react to changes in its environment that allow the protein to bind to or unbind from other molecules present in the cell. By changing the experimental conditions to mimic these signals (pH, temperature, additives, etc.), the behavior of an IDP changes on a global scale, which SAXS and SANS are well equipped to measure.
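As a hedged illustration of how a size estimate can be extracted from a scattering profile, the sketch below applies the Guinier approximation, I(q) ≈ I0 exp(-q²Rg²/3), to a synthetic curve. The function name guinier_fit, the q range, the q·Rg cutoff, and the synthetic profile are all assumptions made for illustration and are not taken from the studies cited here. For IDPs the valid Guinier region is often narrow, so this is a minimal sketch rather than a recommended analysis pipeline.

```python
import numpy as np

def guinier_fit(q, intensity, qrg_max=1.0, max_iter=20):
    """Estimate Rg and I0 from a straight-line fit of ln I versus q^2.

    Hypothetical helper: the fit window is shrunk iteratively until every
    point used satisfies q * Rg <= qrg_max.
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = np.ones_like(q, dtype=bool)  # start from the full q range
    rg = i0 = float("nan")
    for _ in range(max_iter):
        slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        if slope >= 0:
            raise ValueError("Guinier slope must be negative")
        rg = float(np.sqrt(-3.0 * slope))   # slope = -Rg^2 / 3
        i0 = float(np.exp(intercept))       # intercept = ln I0
        new_mask = q * rg <= qrg_max
        if new_mask.sum() < 3:
            raise ValueError("too few points inside the Guinier region")
        if np.array_equal(new_mask, mask):
            break                           # fit window is self-consistent
        mask = new_mask
    return rg, i0

# Synthetic, noise-free profile (assumed values: Rg = 30 A, I0 = 5).
q = np.linspace(0.005, 0.1, 96)             # scattering vector, 1/A
profile = 5.0 * np.exp(-(q * 30.0) ** 2 / 3.0)
rg_est, i0_est = guinier_fit(q, profile)
print(f"Rg = {rg_est:.1f} A, I0 = {i0_est:.2f}")
```

With real data, noise and interparticle effects at low q would need to be handled before a fit of this kind is trusted.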

Label-Based Approaches
Fluorescence correlation spectroscopy (FCS) is an optical technique commonly used to study the diffusion of fluorescently labeled molecular ensembles by measuring the time correlation of fluorescence fluctuations in the detected signal [50,51]. FCS is minimally invasive and does not require high protein concentrations [52,53]. As a solitary method, it is a powerful tool for studying the interactions between an IDP and its associated molecules, such as the Alzheimer's-related protein Tau and tubulin dimers [54]. Although FCS alone cannot reveal information about secondary protein structures, the conformational dynamics of a protein can be determined when it is combined with the results from other fluorescence methods [50]. Electron paramagnetic resonance (EPR) spectroscopy detects unpaired electrons and is commonly coupled with site-directed spin labeling (SDSL) to study a protein's folding and unfolding events, interaction sites, and side chain mobility [45,55-57]. Although traditional labeling of a protein can change its conformational properties, the most common spin labels introduced via site-directed mutagenesis onto cysteine residues are relatively small, which decreases the risk of deviating from wild-type behaviors [56,58]. SDSL-EPR spectroscopy is a sensitive and practical way to study the disorder-to-order transitions an IDP undergoes during binding events in near-native conditions [58,59]. This method is also capable of revealing IDPs or regions of an IDP that remain unstructured upon binding and complex formation [56]. Another EPR technique commonly used in the study of IDPs is Double Electron Electron Resonance (DEER), also called Pulsed Electron Double Resonance (PELDOR), which is well suited to determine spin site distances [45,60,61]. Because DEER requires spin-labeling, the distance measurements possess an inherent uncertainty due to potential (unintended) impacts on molecular conformation from the presence of the labels [45].
Multiparameter fluorescence detection (MFD) is an approach which collects fluorescent information such as intensity, lifetime, anisotropy, excitation and fluorescence spectra, and fluorescence quantum yield [62][63][64][65][66][67]. MFD is useful for improving the resolution of ensemble fluorescence experiments to reveal differences between similar sub-populations [65,67,68].

Single-Molecule Approaches
NMR, EPR (DEER), and SAXS are powerful methods that can be used to collect data about IDPs; however, the information provided is limited to the characteristics of an ensemble. Instead of averaging the properties of an ensemble, single-molecule techniques can resolve dynamics and conformations of individual molecules [69][70][71]. Single-molecule fluorescence (or Förster) resonance energy transfer (smFRET) is an optical spectroscopy approach to measuring the distance between two fluorophores of choice, but the fluorophores and labeling positions must be carefully considered to minimize the possibility of changing the dynamics of the protein. smFRET is particularly useful in the study of IDPs because it can report on the irregular folding dynamics of each protein as well as protein-protein interactions and protein aggregation [72][73][74][75]. smFRET has been applied to studies of many IDPs including the human proteins histone H1 and its partner nuclear protein prothymosin-alpha (ProTa), SNARE complexes such as syntaxin and SNAP-25, and Prostate-associated Gene 4 (PAGE4) [51,72,76-87].
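The distance sensitivity of FRET follows from the Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below, offered as an illustration rather than as any cited group's analysis, converts between distance and transfer efficiency and shows why a broad distance distribution, as expected for an IDP, cannot be summarized by inverting the mean efficiency alone. The Förster radius of 54 Å and the three example distances are assumed values chosen only for demonstration; real values depend on the dye pair and conditions.

```python
import numpy as np

def fret_efficiency(r, r0):
    """Forster transfer efficiency for donor-acceptor distance r (same units as r0)."""
    r = np.asarray(r, dtype=float)
    return 1.0 / (1.0 + (r / r0) ** 6)

def apparent_distance(efficiency, r0):
    """Invert the Forster relation, assuming one fixed distance (a simplification)."""
    efficiency = np.asarray(efficiency, dtype=float)
    if np.any((efficiency <= 0.0) | (efficiency >= 1.0)):
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

R0 = 54.0  # assumed Forster radius in angstroms; dye-pair dependent

# At r = R0 the efficiency is one half by definition.
e_half = float(fret_efficiency(R0, R0))

# Round trip: efficiency -> apparent distance -> efficiency.
r_back = float(apparent_distance(0.8, R0))

# A heterogeneous set of distances (hypothetical): the mean efficiency
# differs from the efficiency evaluated at the mean distance.
distances = np.array([30.0, 54.0, 90.0])
mean_e = float(fret_efficiency(distances, R0).mean())
e_at_mean_r = float(fret_efficiency(distances.mean(), R0))
print(f"<E> = {mean_e:.2f}, E(<r>) = {e_at_mean_r:.2f}")
```

The gap between the two printed numbers is one reason smFRET analyses of IDPs typically rely on a polymer model or simulated ensemble for the distance distribution rather than a single-distance inversion.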

Atomic Force Microscopy
Electron microscopy (EM) was among the first methods used to obtain structural data for proteins; however, it has had limited use for studying IDPs [34]. An exciting new technique being applied to study IDPs is high-speed atomic force microscopy (HS-AFM) [88,89]. HS-AFM is particularly suited for studying protein functions in near-native conditions with no labeling necessary. HS-AFM can observe IDPs transitioning between states of order and disorder and partial folding under certain conditions, with a broader range of applicable length scales than FRET [90]. Interactions with surfaces that might shift energy landscapes, and thus conformational ensembles, are a concern, but the practice of this method is advancing rapidly.

Cryo-EM and X-ray Crystallography
Cryo-electron microscopy (cryo-EM) is a rapidly advancing technique that is gaining popularity as a method for protein structural analysis [91]. Before the development of commercially available direct electron detectors and data analysis software for cryo-EM, X-ray crystallography was the method of choice to investigate protein structure [92]. However, X-ray crystallography does not provide insights on the properties of a disordered region because the flexibility of its atoms prevents coherent scattering of X-rays. Instead, the lack of structural data, or missing electron density, is used to determine where disordered regions are located [37]. Unlike X-ray crystallography, cryo-EM does not require sample crystallization; instead, the proteins are frozen in a thin layer of solution [91,93,94]. Cryo-EM works well for proteins with large molecular weight and can survey multiple conformational states. However, similar to X-ray crystallography, cryo-EM only works with a moderate level of heterogeneity, and regions of disorder are represented with poor resolution [92,93,95]. Time-resolved measurements with both X-ray and cryo-EM methods for folded proteins are developing [96][97][98][99][100].

Solvent Accessibility Methods
Because solvent accessibility is associated with protein folding and stability, it can be a useful parameter when classifying and modeling an IDP [101].

Hydrogen-Deuterium-Exchange
Hydrogen-deuterium-exchange (HDex or HDX) measures differences in deuterium uptake that reflect the solvent accessibility of the protein under native conditions in solution [102]. Information gathered from HDX is useful for studying folding intermediates as well as protein dynamics as the protein performs its function [102][103][104]. IDPs can be difficult to study using HDX because of their flexibility, heterogeneity in solution, and fast deuteration times [102]. Lowering the pH of the solution decreases the exchange rate and provides reasonable experimental time windows for the study of IDPs using HDX. To avoid the effect that lowering the pH can have on a protein's structure and dynamics, pulse labeling HDX has been used to study IDPs [103][104][105].

Crosslinking Mass Spectrometry
Crosslinking mass spectrometry (XL-MS) uses a "bottom-up" approach that supplies information on interaction sites rather than the "top-down" approach of native MS which informs overall protein structure [106]. XL-MS can be used to study the interaction sites between proteins or within a single protein. A cross-linking reaction, which can be performed in the protein's native environment, covalently links nearby amino acids that react with the crosslinker of choice [104]. Another advantage of XL-MS is the low protein concentration required to perform experiments. Two residues often targeted in XL-MS are lysine and arginine which are frequently abundant in disordered regions or disordered proteins, causing XL-MS to gain popularity as a method for IDP studies [106][107][108][109]. However, studying dynamic proteins such as IDPs with XL-MS can be challenging because the results often reflect only a fraction of the conformations or residue distances of the ensemble [104].

Proteolysis
Proteolysis, the enzymatic digestion of a protein into smaller peptides and amino acids, disproportionately affects unstructured sequence regions [110]. IDPs are digested more quickly than ordered proteins due to their flexibility and the accessibility of protease-susceptible sequences [13,111,112]. The rates of digestion are quantified via SDS-PAGE or liquid chromatography mass spectrometry (Figure 2), and can then be used to loosely determine the degree of disorder [113].
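One simple way to turn a digestion time course into a rate is to fit the fraction of intact protein to a single exponential decay. The sketch below does this with a log-linear least-squares fit. First-order kinetics, the function name digestion_rate, the time points, and the two rate constants are all assumptions for illustration; they are not measurements from the cited work, and real digestions may not follow a single exponential.

```python
import numpy as np

def digestion_rate(times, intact_fraction):
    """Apparent first-order rate constant from ln(fraction intact) versus time.

    Hypothetical helper; assumes single-exponential loss of the intact band.
    """
    t = np.asarray(times, dtype=float)
    f = np.asarray(intact_fraction, dtype=float)
    if np.any(f <= 0.0):
        raise ValueError("intact fractions must be positive to take a logarithm")
    slope, _intercept = np.polyfit(t, np.log(f), 1)
    return float(-slope)

# Synthetic band intensities normalized to time zero (assumed values).
times = np.array([0.0, 5.0, 10.0, 20.0, 40.0])   # minutes
idp_like = np.exp(-0.20 * times)                  # rapidly digested
folded_like = np.exp(-0.02 * times)               # slowly digested

k_idp = digestion_rate(times, idp_like)
k_folded = digestion_rate(times, folded_like)
half_life_idp = float(np.log(2.0) / k_idp)        # minutes
print(f"k(IDP-like) = {k_idp:.3f} per min, k(folded-like) = {k_folded:.3f} per min")
```

Comparing the fitted constants for a test protein against folded and disordered controls run under identical protease conditions gives the loose, relative measure of disorder described above.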

Spectroscopies
Spectroscopy is an invaluable tool for probing and studying characteristics of IDPs.

Circular Dichroism
Circular dichroism, the difference between the absorption coefficients of left- and right-handed circularly polarized light, is measured via circular dichroism (CD) spectroscopy [114]. CD spectroscopy is a powerful technique used to investigate secondary structure elements [115,116]. IDPs possess dynamic secondary structures that can be well assessed and characterized in an average sense by CD spectroscopy analysis [114,117]. Structural dynamics of an IDP can be reduced or promoted by altering its physical or chemical environment, and the resulting changes can be quantified using CD spectroscopy. The two spectral regions used to study CD in proteins are the near-UV (250-300 nm), which corresponds to the aromatic side chains, and the far-UV (175-250 nm), which informs about secondary structures. Because an IDP moves through secondary structures as it changes conformations, far-UV CD spectroscopy is particularly useful for reporting the presence of alpha helices and beta sheets [118]. Time-resolved approaches using synchrotron light sources can provide information on dynamic processes in proteins down to nanosecond timescales [119], which eventually may prove useful for IDPs.

Fourier Transform Infrared Spectroscopy
Fourier transform infrared spectroscopy (FTIR) is another spectroscopic method used to study the secondary structure of proteins [120]. FTIR relies on the absorption of infrared light at the frequency of the sample's molecular vibrational modes. The vibrational modes of a polypeptide chain, a repeated sequence of peptide bonds inherent to proteins, can produce up to nine bands measured by FTIR, the two most studied being the amide I and amide II bands [121]. Specifically, the amide I band is used to observe secondary structure formation. FTIR is also commonly used to study the aggregation of IDPs, such as the Parkinson's disease associated IDP α-synuclein [120,122,123].

Raman Spectroscopy
Raman spectroscopy obtains its name from its use of Raman scattering, or the inelastic scattering of light, due to a system's molecular vibrations [121,124]. Comparable to FTIR spectroscopy, the measured change in energy from the incident light can be correlated to the protein's vibrational modes and secondary structure [125,126]. Raman spectroscopy can be performed at dilute concentrations, which is advantageous in the study of IDPs due to their aggregation tendencies at high concentrations [127]. The conformational changes of an IDP are also well characterized by Raman spectral analysis. Raman optical activity (ROA) is another Raman scattering technique that measures the change in vibrational spectra due to left- and right-circularly polarized light and can add information about secondary and tertiary structures [124,128].

Mass Spectrometry
Native mass spectrometry (MS) is a technique used in structural biology for studying the structure and stoichiometry of proteins through their mass-to-charge (m/z) ratio. MS has the capability to inform on multiple conformational states present in a heterogeneous mixture and is often combined with other methods, such as ion mobility MS (IM-MS), which can separate the proteins by size and charge [104,129,130]. Time-resolved MS has been successfully used to measure dynamic processes in proteins [131].

Hydrodynamic Characterizations
The hydrodynamic properties of a protein are necessary for conformation classification and can be determined with methods such as dynamic light scattering (DLS), FCS (see Section 2.3), size-exclusion chromatography (SEC, also known as gel filtration), and analytical ultracentrifugation (AUC) [46]. DLS, SEC, and AUC are complementary methods for studying the hydrodynamic (Stokes) radius, R_H [132]. DLS is a simple and noninvasive technique that can be used to obtain information on a protein's hydrodynamic dimensions [133,134]. DLS measures fluctuations in scattered light caused by Brownian motion and has been applied to the study of IDPs with high aggregation tendencies [135,136]. SEC uses porous beads to separate molecules based on hydrodynamic dimensions [46]. AUC uses the centrifugal force generated in a centrifuge to separate molecules based on their hydrodynamic properties (Figure 2). AUC can experimentally determine the sedimentation coefficient, s, which is inversely related to the Stokes radius [132]. Various combinations of these techniques, as well as molecular simulations, have been used to calculate and confirm the hydrodynamic characteristics of IDPs [78,137].
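The link between these measurements can be made concrete with two textbook relations: the Stokes-Einstein equation, R_H = kB·T / (6π·η·D), which converts a diffusion coefficient from DLS or FCS into a Stokes radius, and the Svedberg relation, s = M(1 − v̄ρ) / (N_A·6π·η·R_H), which shows the inverse dependence of s on R_H. The sketch below is illustrative only. The solvent properties are approximate values for water at 20 °C, and the partial specific volume, the diffusion coefficient, and the 20 kDa mass are assumed example inputs, not data from the cited studies.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol

def stokes_radius(diff_coeff, temperature=293.15, viscosity=1.002e-3):
    """Stokes-Einstein radius in metres from a diffusion coefficient in m^2/s.

    Defaults approximate water at 20 C (viscosity in Pa*s).
    """
    return K_B * temperature / (6.0 * math.pi * viscosity * diff_coeff)

def sedimentation_coefficient(molar_mass, r_h, vbar=0.73e-3,
                              density=998.2, viscosity=1.002e-3):
    """Svedberg relation; returns s in seconds.

    molar_mass in kg/mol, r_h in m, vbar (partial specific volume) in m^3/kg,
    density in kg/m^3. vbar = 0.73 mL/g is a commonly assumed protein value.
    """
    buoyancy = 1.0 - vbar * density
    return molar_mass * buoyancy / (N_A * 6.0 * math.pi * viscosity * r_h)

# Assumed example inputs: D = 7e-11 m^2/s and a 20 kDa protein.
r_h = stokes_radius(7.0e-11)
s_compact = sedimentation_coefficient(20.0, r_h)
s_expanded = sedimentation_coefficient(20.0, 2.0 * r_h)  # same mass, twice the radius
print(f"R_H = {r_h * 1e9:.2f} nm, s = {s_compact * 1e13:.2f} S")
```

For a fixed mass, doubling R_H halves s, which is why an expanded IDP sediments more slowly than a compact folded protein of the same molecular weight.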

Computational Methods
All-atom molecular dynamics simulation (MD simulation) is a computational method used to predict the behavior of proteins, especially when combined with parameters from data acquired via X-ray crystallography, SAXS, NMR, or other techniques [36]. MD has been increasingly applied to characterize conformational ensembles of IDPs [138][139][140][141][142]. MD simulation is a highly valuable tool for data analysis and structural modeling but is not without its limitations. Force fields that are used for MD simulations of structured proteins often fail when applied to IDPs, and the inhomogeneous conformational landscape occupied by any single IDP also presents modeling challenges [143]. MD simulations are a key tool in integrative structural biology due to their ability to combine information from many methods and create a unified model of a protein's structure and conformational changes [144,145].
Until recently, a protein's tertiary structure could not be reliably predicted from its amino acid sequence. The residue sequence in disordered regions differs in composition from that of ordered proteins [146,147]. Several disordered regions of proteins have been predicted by a group of algorithms within the PONDR (predictor of natural disordered regions) family [148,149]. Using more than one predictor and averaging the results provides a more robust disorder profile than a single algorithm [144]. The associative memory, water-mediated, structure and energy model (AWSEM) is a coarse-grained force field model that has been used to predict protein structure, folding, and aggregation [144,150,151]. AWSEM's optimized force fields have correctly predicted protein structures dependent solely on sequence [150,152,153]. AlphaFold, a machine learning model created by DeepMind, has made significant strides in the field of structural biology after successfully predicting the three-dimensional structure of proteins based on sequence data [154]. Regions of low confidence in AlphaFold's predictions correlate with disordered regions and confirm previous estimates that more than 30% of protein regions are disordered [154].
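The idea of averaging several predictors can be sketched in a few lines. The example below takes per-residue disorder scores from multiple predictors, averages them into a consensus profile, and reports contiguous segments above a threshold. It does not call PONDR or any other real predictor: the score arrays are made-up numbers, and the function names, the 0.5 threshold, and the minimum segment length are hypothetical choices for illustration.

```python
import numpy as np

def consensus_disorder(score_table, threshold=0.5):
    """Average per-residue scores across predictors (rows) and threshold the mean."""
    scores = np.asarray(score_table, dtype=float)
    if scores.ndim != 2:
        raise ValueError("expected a 2D table: predictors x residues")
    mean_profile = scores.mean(axis=0)
    return mean_profile, mean_profile >= threshold

def disordered_segments(is_disordered, min_length=3):
    """Return (start, end) index pairs, end exclusive, for runs of True."""
    segments = []
    start = None
    for i, flag in enumerate(is_disordered):
        if flag and start is None:
            start = i                       # a new run begins
        elif not flag and start is not None:
            if i - start >= min_length:
                segments.append((start, i))
            start = None                    # the run has ended
    if start is not None and len(is_disordered) - start >= min_length:
        segments.append((start, len(is_disordered)))
    return segments

# Made-up scores from three hypothetical predictors for an 8-residue stretch.
scores = [
    [0.9, 0.8, 0.7, 0.4, 0.2, 0.1, 0.6, 0.7],
    [0.8, 0.9, 0.6, 0.5, 0.3, 0.2, 0.5, 0.9],
    [0.7, 0.7, 0.8, 0.3, 0.1, 0.3, 0.7, 0.8],
]
profile, flags = consensus_disorder(scores)
segments = disordered_segments(flags, min_length=2)
print(segments)
```

Averaging damps the idiosyncrasies of any single algorithm; a weighted mean or a majority vote would be equally reasonable variations on the same idea.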

The Integrative Structural Biology Approach to IDPs and Examples
IDPs by nature fluctuate on many timescales among wide ranges of conformations. Their conformational ensembles can be altered by accessory proteins or post-translational modification. Thus, using an integrated, multiscale approach (integrative structural biology) rather than a single isolated technique is more prudent for accurately characterizing the dynamics and fluctuating conformational landscapes inherent to IDPs. Using a battery of methods with different sensitivities, complemented by advanced computational simulations, is essential to characterize the full range of the conformation space. Studying an IDP is like putting together a jigsaw puzzle where each method provides a limited number of pieces. Method one may gather all the blue pieces together, while method two helps arrange the edges. The full picture comes together only when the information from one method can be placed into its larger context with complementary methods. Therefore, instead of relying on the limited data provided by a single experimental method, integrative structural biology combines the data from various methods to form models and a more complete understanding of these proteins [34-36,61,80,140,155-163].
As the application of integrative approaches to study IDPs is increasing at a rapid pace, here we will highlight only a few of the many successes. Our goal is to illustrate some different combinations of methods or cases where unexpected functions are uncovered. We will not discuss the important related topic of using combinations of methods to characterize dynamic assemblies of folded domains connected by disordered linkers [34,159,164-166].

Ubiquitin
To examine the robustness of an integrative structural biology approach, the ubiquitin protein in its denatured state was observed by combining results from multiple methods [156]. Ubiquitin is a regulatory protein with a tertiary structure that is denatured as urea concentration increases. Data collected from smFRET, NMR, and SAXS were in good agreement for the distance distributions of unfolded ubiquitin. Local structure and dynamics were derived from NMR restraints while the overall shape was provided by SAXS measurements. Intramolecular distances and distributions within subpopulations as well as dynamic properties of the protein's conformational changes were uncovered by smFRET. In this study, combining the results of smFRET, NMR, and SAXS provided a complete picture of the conformational ensemble of this unfolded protein.
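To make concrete how one conformational ensemble yields the different quantities that SAXS and smFRET report, the sketch below generates a toy freely jointed chain ensemble and computes its root-mean-squared radius of gyration (a SAXS-type observable) and root-mean-squared end-to-end distance (an smFRET-type observable). This is not the ensemble modeling used in the cited ubiquitin study. The random-walk model, the 3.8 Å bond length, the chain length, and all function names are assumptions chosen for illustration, and real unfolded proteins deviate from this ideal-chain behavior.

```python
import numpy as np

def random_coil_ensemble(n_chains, n_bonds, bond_length, seed=0):
    """Toy freely jointed chains: returns coordinates with shape (chains, beads, 3)."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(size=(n_chains, n_bonds, 3))
    # Normalize each step to a fixed bond length with a random direction.
    steps *= bond_length / np.linalg.norm(steps, axis=2, keepdims=True)
    origin = np.zeros((n_chains, 1, 3))
    return np.concatenate([origin, np.cumsum(steps, axis=1)], axis=1)

def rms_radius_of_gyration(coords):
    """Ensemble root-mean-squared radius of gyration (equal-mass beads)."""
    centered = coords - coords.mean(axis=1, keepdims=True)
    rg_squared = (centered ** 2).sum(axis=2).mean(axis=1)   # one value per chain
    return float(np.sqrt(rg_squared.mean()))

def rms_end_to_end(coords):
    """Ensemble root-mean-squared distance between first and last beads."""
    ree_squared = ((coords[:, -1] - coords[:, 0]) ** 2).sum(axis=1)
    return float(np.sqrt(ree_squared.mean()))

# Assumed toy parameters: 2000 chains of 100 bonds, 3.8 A per bond.
ensemble = random_coil_ensemble(n_chains=2000, n_bonds=100, bond_length=3.8)
rg = rms_radius_of_gyration(ensemble)
ree = rms_end_to_end(ensemble)
ratio = (ree / rg) ** 2
print(f"Rg = {rg:.1f} A, Ree = {ree:.1f} A, Ree^2/Rg^2 = {ratio:.2f}")
```

For an ideal chain the squared ratio is close to 6, so either quantity would determine the other; agreement or disagreement between the two experimental values is therefore informative about how far a real ensemble departs from such a simple model.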

Nucleoporins
Phenylalanine-glycine-rich nucleoporins have also been studied using an integrative structural biology approach [167]. A combination of SAXS and smFRET measurements was compared with MD simulations that used different models for solvent interactions. The ultimate agreement of experiment and simulations in this work highlights successful approaches to improve theoretical force fields used to model IDPs.

Aggregation-Prone Synaptic Proteins
The aggregation of some IDPs is associated with the pathology of diseases, such as α-Synuclein (αS) with Parkinson's disease or amyloid-β (Aβ) and Tau with Alzheimer's disease. αS has previously been investigated using X-ray diffraction [168] and NMR [157], but more recent studies [158] have used smFRET combined with MD simulations and NMR measurements to provide information on its structure and dynamics. Good agreement was found with other methods, and the conditions found to promote aggregation pointed toward possible therapeutic approaches to target αS.
Similarly, Aβ has been investigated [163]. Fluorophores were used to label both the N- and C-termini, and FRET was observed in both free-diffusion and immobilized modes. Again, results aligned with previously reported data while adding information on possible reasons for aggregation of monomeric Aβ.
In the mid-1990s, before the wide acceptance of IDPs, studies of the Tau protein showed that its overall shape and conformation were similar to those of a denatured protein with no tertiary structure [169]. Since then, integrative structural biology has enhanced our understanding of these IDPs which are implicated in neurodegenerative disorders. There is evidence that both the aggregation of Tau and increased Tau-tubulin binding influence the pathology of disease. smFRET data combined with Monte Carlo simulations provide possible Tau conformations on binding to tubulin dimers [169].

Sic1
In Saccharomyces cerevisiae, Sic1 is a disordered protein involved in cell cycle regulation and DNA replication initiation [170][171][172]. Sic1 forms a complex with Cdc4, a subunit of a ubiquitin ligase, after the phosphorylation of at least six of the nine Cdc4 phosphodegron (CPD) sites on Sic1, seven of which are located in the 90-residue N-terminal region (Figure 3A) [170,172]. Phosphorylation followed by ubiquitination results in the degradation of Sic1, which allows DNA replication to begin [170,171,173,174]. An integrative structural biology approach to Sic1 characterization that used NMR, SAXS, and smFRET (Figure 3B,C) focused on the seven CPD sites in the disordered N-terminal region [170]. Phosphorylated Sic1 (pSic1) has different binding properties than Sic1, but neither phosphorylation nor Cdc4 binding causes a disorder-to-order transition of Sic1. SAXS and smFRET analyses of both Sic1 and pSic1 were constrained by including NMR-PRE data and indicated a subtle conformational change in Sic1 after phosphorylation. Analysis of the SAXS and smFRET data showed that these methods were individually capable of accurately measuring the root-mean-squared radius of gyration Rg and the root-mean-squared end-to-end distance Re-e, respectively. SAXS data alone show little change in conformational properties between Sic1 and pSic1; however, SAXS+PRE restrained ensembles show an expansion of Re-e that is consistent with the change in distance observed by smFRET.
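The relationship between the two observables discussed for Sic1 can be illustrated with a minimal sketch. It is not the analysis used in [170]. It assumes the standard Förster relation for a single fixed donor-acceptor distance, E = 1/(1 + (r/R0)^6), and an ideal Gaussian chain, for which the mean-square end-to-end distance is six times the mean-square radius of gyration. Real IDP analyses average the efficiency over a distance distribution and typically use more refined polymer models. The function names and the Förster radius below are hypothetical placeholders.

```python
import math


def fret_efficiency(r, r0):
    """Forster transfer efficiency for one fixed distance r (same units as r0)."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(e, r0):
    """Invert the Forster relation; only valid for 0 < e < 1 and a single distance."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


def ree_from_rg_gaussian(rg):
    """Ideal-chain assumption: <Ree^2> = 6 <Rg^2>."""
    return rg * math.sqrt(6.0)


def rg_from_ree_gaussian(ree):
    """Inverse of ree_from_rg_gaussian under the same ideal-chain assumption."""
    return ree / math.sqrt(6.0)


if __name__ == "__main__":
    R0 = 50.0  # placeholder Forster radius in Angstrom, not taken from the study
    for label, rg in (("Sic1", 30.0), ("pSic1", 32.0)):  # Rg values quoted in the text
        ree = ree_from_rg_gaussian(rg)
        print(label, "ideal-chain Ree:", round(ree, 1), "A,",
              "single-distance E:", round(fret_efficiency(ree, R0), 3))
```

Because the efficiency depends on the sixth power of distance, a small change in Rg can produce a more visible change in the smFRET signal than in the SAXS profile, which is one reason the two methods complement each other.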

Figure 3. (A) The minimum functional KID fragment, the last 70 residues, is indicated in the purple box. (B) Top three panels (i-iii) show smFRET efficiency histograms of Sic1 and pSic1 and end-to-end probability distributions. Panels iv-vi show SAXS data for Sic1 and pSic1 and the deduced Rg, which was estimated to be approximately 30 Å for Sic1 and 32 Å for pSic1 [170]. (B) is adapted with permission from [170]. Copyright 2020, American Chemical Society. (C) Upper panel displays ¹HN-¹⁵N correlation spectra of Sic1 (black) and pSic1 (red). The lower panel shows experimentally determined secondary structural propensity (SSP) values (described in [175]) for Sic1 (black bars) and pSic1 (open bars). Note that the helical vs. extended interpretations are marked on the right axis. Red circles indicate the locations of the phosphorylation sites [175]. (C) is reproduced with permission from [175]. Copyright 2008, National Academy of Sciences, USA.

N-WASP
An integrative approach allowed characterization of the conformational ensemble of the disordered domain of the neural Wiskott-Aldrich syndrome protein (N-WASP) [176], which regulates actin assembly pathways [177]. MD modeling generated conformational ensembles of the protein, which were validated by NMR and SAXS measurements. Using the SAXS and NMR data to benchmark simulations and guide selection of optimal force fields allowed the MD simulations to reveal both the global and local details of the conformational ensemble of this disordered protein. The simulations provided information about the transient underlying secondary structure within the ensembles. The use of experimentally derived restraints to guide computational modeling [178][179][180] or, more generally, cross-validating simulation with experiment is a powerful tool to apply to IDP studies because it provides insights into both global and local structural features of the conformational ensembles.
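The idea of benchmarking simulations against experiment to select a force field can be sketched as follows. This is an illustrative outline only, not the procedure of [176]: it assumes each candidate force field yields ensemble-averaged predictions for a set of observables, and it scores each candidate with a reduced chi-squared against the measured values. All class and function names and all numbers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Observable:
    """One experimental measurement with its uncertainty (hypothetical container)."""
    name: str
    experiment: float
    sigma: float


def reduced_chi2(observables, predictions):
    """Mean squared, uncertainty-weighted deviation between prediction and experiment."""
    total = 0.0
    for obs in observables:
        residual = (predictions[obs.name] - obs.experiment) / obs.sigma
        total += residual ** 2
    return total / len(observables)


def rank_force_fields(observables, predictions_by_ff):
    """Return (force_field, score) pairs sorted from best to worst agreement."""
    scores = {ff: reduced_chi2(observables, preds)
              for ff, preds in predictions_by_ff.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder values for illustration only; not data from any study.
    measured = [Observable("rg_angstrom", 30.0, 1.0),
                Observable("helix_fraction", 0.10, 0.02)]
    candidates = {
        "force_field_A": {"rg_angstrom": 31.0, "helix_fraction": 0.12},
        "force_field_B": {"rg_angstrom": 24.0, "helix_fraction": 0.30},
    }
    for name, score in rank_force_fields(measured, candidates):
        print(name, round(score, 2))
```

Scoring global observables (such as Rg from SAXS) and local observables (such as NMR-derived secondary structure propensities) together reflects the point made above: a force field should reproduce both scales before its ensembles are interpreted.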

SNAP-25
SNAP-25 is a SNARE protein that is a key player in neurotransmitter release. SNAP-25 is an intrinsically disordered protein that undergoes a disorder-to-order transition upon binding its partners syntaxin and synaptobrevin, folding into colinear alpha helices to form the SNARE complex. SNARE complex formation is associated with membrane fusion of synaptic vesicles to the pre-synaptic terminal to release neurotransmitters. Integrative structural biology investigations of SNAP-25 combining smFRET, AUC, DLS, circular dichroism (CD), and SEC characterized the conformational ensemble in the isolated disordered state as consistent with a simple, semi-flexible polymer model with no underlying structure [78]. Interestingly, smFRET measurements of SNAP-25 in a binary complex with syntaxin (lacking synaptobrevin), which is on the pathway to the full SNARE complex, found a transient tendency to switch between the folded alpha helix and a disordered conformation [87]. Returning to the isolated protein with the additional methods of single-molecule MFD, double electron-electron resonance (DEER), and MD, transient helix-coil transitions in short regions of SNAP-25 were observed on sub-millisecond timescales, even though a disordered fluctuating ensemble remained the dominant conformational feature [161]. It was suggested that these transient helix-forming tendencies could prime SNAP-25 to zip into the SNARE complex rapidly upon binding its requisite partners, contributing to the speed of neurotransmitter release. This example illustrates the value of the integrative structural biology approach for addressing measurements at the many length scales and timescales required to characterize IDP conformational ensembles, especially those with switching tendencies [82].
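As an illustration of what a semi-flexible polymer description predicts, the sketch below evaluates the mean-square end-to-end distance of a worm-like chain. This is an assumption made for illustration: the worm-like chain is a common semi-flexible polymer model, but the specific model and parameters used in [78] are not stated here, and the residue count, contour length per residue, and persistence length below are hypothetical placeholders.

```python
import math


def wlc_mean_square_ree(contour_length, persistence_length):
    """Worm-like chain: <R^2> = 2*lp*L * (1 - (lp/L) * (1 - exp(-L/lp)))."""
    if contour_length <= 0 or persistence_length <= 0:
        raise ValueError("lengths must be positive")
    ratio = persistence_length / contour_length
    # -expm1(-x) equals 1 - exp(-x) and is numerically stable for small x
    decay = -math.expm1(-contour_length / persistence_length)
    return 2.0 * persistence_length * contour_length * (1.0 - ratio * decay)


def wlc_rms_ree(contour_length, persistence_length):
    """Root-mean-squared end-to-end distance in the same units as the inputs."""
    return math.sqrt(wlc_mean_square_ree(contour_length, persistence_length))


if __name__ == "__main__":
    # Placeholder inputs: 100 residues at 3.8 A per residue, 4 A persistence length.
    n_residues, per_residue, lp = 100, 3.8, 4.0
    print(round(wlc_rms_ree(n_residues * per_residue, lp), 1), "A")
```

In the flexible limit (contour length much longer than the persistence length) the expression approaches 2·lp·L, and in the rigid limit it approaches L², which is why a single persistence length can summarize how a structureless chain scales with the number of residues between dye positions.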

p27
p27 is a member of the Kip family of cyclin-dependent kinase inhibitors and plays an important role in controlling the cell cycle in eukaryotes [181]. Binding of the disordered C-terminal domain of p27 to cyclin-dependent kinase 2 (Cdk2) and cyclins results in a disorder-to-order transition that has regulatory impact on the cell cycle. By integrating results from single-molecule multiparameter fluorescence spectroscopy, stopped-flow experiments, and molecular dynamics simulation, the multi-step process of assembling this fuzzy complex involving the disordered domain of p27 was mapped out [182,183].
Unbound p27 was found by an integrated approach to be compact, on the scale of a random coil, but rapidly fluctuating, with dynamics spanning orders of magnitude in time scale, from nanoseconds to milliseconds [184,185]. Interaction with its binding partners induced a multi-step process in which p27 switches among conformational ensembles until a favorable conformation is encountered to advance the binding process. In the end, p27 binds its partners in a more extended conformation than in isolation but remains dynamic, without a fixed structure, in a fuzzy complex. Elucidation of this pathway suggests that assembly of the complex starts with a first recognition step involving conformational selection among rapidly fluctuating states, followed by a waiting period for a switch between distinct conformational sub-ensembles, which permits progression to a later step where an induced-fit phenomenon completes the assembly. The complexity of the binding sequence is suggested to offer multiple opportunities for regulation of the assembly by other cellular signals.

PAGE4
Prostate-associated gene 4 (PAGE4) is an IDP that is expressed only in the prostate and only during early developmental stages and in the cancerous state [184]. An integrated structural biology approach identified phosphorylation-induced changes in the conformational ensemble of this IDP that were connected to a cellular signaling pathway important to cancer progression. Combining experimental results from NMR, PRE, SAXS, and smFRET studies with MD simulations revealed distinct changes in the conformational ensemble upon phosphorylation by different kinases [79,80,83,142,144,185,186]. In particular, phosphorylation by homeodomain-interacting protein kinase 1 (HIPK1) leads to a more compact ensemble, whereas phosphorylation by CDC-like kinase 2 (CLK2) expands the ensemble. The change in the conformational state was connected to signaling in prostate cancer through its ability to regulate interactions with the transcription factor c-Jun. HIPK1 treatment resulted in increased c-Jun-dependent transcription activity in cell models of prostate cancer, while CLK2 phosphorylation caused the opposite [79,80]. Given that CLK2 and PAGE4 are expressed only in androgen-dependent prostate cancer cells, whereas HIPK1 is expressed in all prostate cancer cells (both androgen-dependent and -independent), the phosphorylation states that result in the expanded and contracted conformational ensembles were correlated with androgen sensitivity in prostate cancer [83,144,185,186,187,188,189]. Modeling these changing transcription factor interactions in a cellular androgen control pathway suggested that the PAGE4 phosphorylation state could oscillate in time, which could result in temporal oscillations of androgen sensitivity in prostate cancer [187][188][189]. This model suggests direct connections between changes in the conformational ensemble of an IDP and cell phenotypes in a cancer model. Such a complete picture would not have been obtained without the use of the integrated structural biology approach.

Summary and Conclusions
It took more than two decades for IDPs to be recognized as legitimate biological entities [190] with important roles in myriad biological processes, from prebiotic evolution, multicellularity, and cell fate determination to phenotypic plasticity, adaptive evolution, and disease pathology. Several of Uversky's contributions to the IDP field have shed new light on these important components of the proteome, including remarkable conceptual advances from a dynamical systems perspective [191,192]. Therefore, this Special Issue of Biomolecules, dedicated to Vladimir Uversky on his 60th birthday, is a celebration of his many contributions over the past three decades.
Author Contributions: Conceptualization, R.E., P.K. and K.W.; writing-original draft preparation, R.E.; writing-review and editing, R.E., S.R., P.K. and K.W.; graphics, S.R. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.