Structural Insights into the Intrinsically Disordered GPCR C-Terminal Region, Major Actor in Arrestin-GPCR Interaction

Arrestin-dependent pathways are a central component of G protein-coupled receptor (GPCR) signaling. However, the molecular processes regulating arrestin binding remain to be elucidated, in particular with regard to the structural impact of GPCR C-terminal disordered regions. Here, we used an integrated biophysical strategy to describe the basal conformations of the C-terminal domains of three class A GPCRs: the vasopressin V2 receptor (V2R), the growth hormone secretagogue or ghrelin receptor type 1a (GHSR), and the β2-adrenergic receptor (β2AR). By doing so, we revealed the presence of transient secondary structures in these regions that are potentially involved in the interaction with arrestin. These secondary structure elements differ from those described in the literature for the same regions in interaction with arrestin. This suggests a mechanism in which the secondary structure conformational preferences of the C-terminal regions of GPCRs could be a central feature for optimizing arrestin recognition.


Introduction
G protein-coupled receptors (GPCRs) are integral membrane proteins involved in signal transduction. They are central to the cellular response to a wide range of extracellular ligands, such as hormones, nucleotides, lipids, ions, photons, and neurotransmitters [1]. Their signaling outcome regulates a large number of biological functions and, therefore, their dysfunctions are linked to various pathologies [2]. Consequently, GPCRs are the target of a third of the current clinical drugs [3]. Their functional diversity comes from the existence of a large number of GPCRs (~800 members in humans), classified according to their sequence and phylogenetic analyses [4]. While they share a highly conserved core domain (7TM) composed of seven transmembrane helices connected by three extracellular (ECL) and three intracellular (ICL) loops, their extracellular N- and intracellular C-termini, as well as the loops, are highly variable in both sequence and length (Figure 1) [5,6].
Upon extracellular ligand binding, GPCR conformational rearrangements allow G protein association and the initiation of G protein-dependent signaling. Besides G proteins, GPCRs can trigger other signaling pathways by interacting with arrestin, i.e., desensitization, internalization, and receptor trafficking [7]. Arrestin interaction requires the GPCR C-terminal region (GPCR-Cter). GPCR:arrestin complex formation leads to arrestin activation through conformational changes (for a review see [10]). However, putative conformational changes of GPCR-Cter induced by GRK phosphorylation or arrestin binding are still poorly described. These C-terminal regions are predicted to behave as intrinsically disordered regions (IDRs) [11,12]. Intrinsically disordered proteins (IDPs) and IDRs are highly dynamic proteins/regions with a low content of transient secondary structures, which makes their structural characterization difficult [13]. Atomic structures of GPCR-arrestin complexes revealed that the C-termini of the vasopressin V2 receptor (V2R) and rhodopsin are partially folded on the arrestin surface [13][14][15][16]. This suggests a mechanism by which the IDR undergoes conformational changes upon post-translational modification and/or binding to its target [14]. Indeed, IDPs/IDRs most likely contain some pre-formed secondary structure elements that are often involved in the recognition of specific partners and modulate the affinity [15].
In order to better understand how the structural features of the GPCR C-terminal regions could impact their functional role, we characterized the structure of the truncated C-terminal regions of three GPCRs, namely the vasopressin V2 receptor (V2R), the ghrelin receptor type 1a (GHSR), and the β2-adrenergic receptor (β2AR) (Figure 1). These three class A receptors are important therapeutic targets [2] and are representative of the two different classes of arrestin binders [16]. The first class, which includes β2AR and GHSR, forms a transient complex with arrestin that dissociates near the plasma membrane. Thus, arrestin does not internalize with the receptor. In contrast, the second class, which includes V2R, forms a more stable complex, allowing the internalization of the whole assembly.
Here, we confirmed the disordered nature of these three GPCR-Cters and the presence of transient secondary structures by a set of biophysical tools: circular dichroism (CD), multi angle light scattering (MALS), and small angle X-ray scattering (SAXS). Then, we used nuclear magnetic resonance (NMR) to probe the conformational and dynamic preferences at residue level, using a set of complementary experiments, such as secondary chemical shifts (SCS), scalar (J) and residual dipolar couplings (RDCs), paramagnetic relaxation enhancement (PRE) and relaxation [17][18][19][20]. We show that the three C-terminal regions of the chosen GPCRs displayed different transient secondary structures, which could be involved in arrestin binding. These regions could act as short linear motifs (SLiMs), partner recognition IDP segments that are embedded in poorly conserved, disordered regions. By comparison with crystallographic structures of synthetic peptides in interaction with arrestin [21][22][23], our results suggest that structural changes in these putative SLiMs occur either after phosphorylation of the C-terminal region or upon arrestin binding.
Concentration of V2R-Cter was determined by refractometry after gel filtration, while concentrations of β2AR-Cter and GHSR-Cter were estimated using absorbance at 280 nm.

Size Exclusion Chromatography-Multi-Angle Light Scattering (SEC-MALS)
The experiments were performed at 25 °C using a Superdex 75 10/300 GL column (GE HealthCare) connected to a miniDAWN-TREOS light scattering detector and an Optilab T-rEX differential refractive index detector (Wyatt Technology, Santa Barbara, CA, USA). The column was equilibrated in 50 mM BisTris pH 6.7, 50 mM NaCl, 1 mM TCEP, and 0.5 mM EDTA buffer filtered at 0.1 µm, and the SEC-MALS system was calibrated with a sample of bovine serum albumin (BSA) at 1 mg/mL. Samples at 1.5 mM, 0.6 mM, and 0.7 mM were prepared for V2R-Cter, GHSR-Cter, and β2AR-Cter, respectively. For each GPCR-Cter, 40 µL of sample were injected at 0.5 mL/min. Data acquisition and analyses were performed using the ASTRA software (Wyatt).

Circular Dichroism (CD)
Far-UV spectra of the C-termini were recorded in a quartz cuvette (path length 0.1 cm) at 0.08 mg/mL in H2O at 20 °C using a Chirascan. The ellipticity was scanned from 190 to 260 nm with an increment of 0.5 nm, an integration time of 3 s, and a constant band-pass of 1 nm. Data were treated using Chirascan and, after subtraction of the buffer signal, were converted to mean residue ellipticity ([θ]MRW, mdeg·cm²·dmol⁻¹) using Equation (1) [38], where θ is the ellipticity (mdeg), Mw is the molecular weight (g/mol), L is the cell length (cm), C is the protein concentration (mg/mL), and n is the number of peptide bonds.
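The conversion to mean residue ellipticity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, and the prefactor follows the widely used form [θ] = θ·Mw/(10·L·C·n), which gives deg·cm²·dmol⁻¹ when θ is entered in mdeg. Equation (1) of [38] may use a different unit convention, so the numerical factor should be checked against that reference.

```python
def mean_residue_ellipticity(theta_mdeg, mw_g_per_mol, path_cm,
                             conc_mg_per_ml, n_peptide_bonds):
    """Convert a buffer-subtracted ellipticity to mean residue ellipticity.

    Assumed convention (not taken from the paper): theta in mdeg,
    result in deg.cm^2.dmol^-1, i.e. theta * Mw / (10 * L * C * n).
    """
    if path_cm <= 0 or conc_mg_per_ml <= 0 or n_peptide_bonds <= 0:
        raise ValueError("path length, concentration and n must be positive")
    return theta_mdeg * mw_g_per_mol / (
        10.0 * path_cm * conc_mg_per_ml * n_peptide_bonds
    )


# Hypothetical reading of -8 mdeg for a 3264 g/mol peptide with 28 peptide
# bonds, measured at 0.08 mg/mL in a 0.1 cm cuvette.
example_mre = mean_residue_ellipticity(-8.0, 3264.0, 0.1, 0.08, 28)
```

The cuvette path length and concentration in the example match the values quoted above; the ellipticity reading and the number of peptide bonds are illustrative only.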

Small-Angle X-ray Scattering (SAXS) Measurement and Analysis
Synchrotron radiation SAXS data were acquired for GPCR-Cters at the SWING beamline at the SOLEIL synchrotron (Saint-Aubin, France) [39] using an X-ray wavelength of 1.03 Å and a sample-to-detector distance of 1.99 m. Samples were measured at 15 °C and at two concentrations, 5 mg/mL and 10 mg/mL, for all GPCR-Cters, in 50 mM BisTris pH 6.7, 50 mM NaCl, and 2 mM DTT buffer. Before exposure to X-rays, 45 µL of sample were injected onto a 3 mL Superdex 75 5/150 GL column (GE HealthCare) at 0.2 mL/min, pre-equilibrated in the same buffer as the samples. The intensity was measured as a function of the magnitude of the scattering vector, s, using Equation (2) [40], where θ is the scattering angle and λ is the X-ray wavelength. The scattering patterns of the buffer were recorded before the void volume of the column (1 mL). The measured scattering profiles covered a momentum transfer range of 0.002 < s < 0.5 Å⁻¹. Data were processed using CHROMIXS from the ATSAS software package [41] to automatically select frames corresponding to buffer and sample and to perform buffer subtraction. The scaled and averaged SAXS curves were analyzed using PRIMUS from the ATSAS software package.
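As a minimal sketch of the quantities involved, the Python snippet below computes the magnitude of the scattering vector and the Kratky transform s²·I(s) of an intensity value. It assumes the common convention s = 4π·sin(θ)/λ with 2θ the full scattering angle; the function names are hypothetical, and the angle convention of Equation (2) in [40] should be checked before reusing it.

```python
import math


def scattering_vector(two_theta_deg, wavelength_angstrom):
    """Magnitude of the scattering vector in inverse angstroms.

    Assumed convention: s = 4*pi*sin(theta)/lambda, where two_theta_deg
    is the full scattering angle (2*theta) in degrees.
    """
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    theta_rad = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta_rad) / wavelength_angstrom


def kratky(s, intensity):
    """Ordinate of a Kratky plot, s^2 * I(s)."""
    return s * s * intensity


# Wavelength quoted in the text; the angle is an arbitrary example value.
s_example = scattering_vector(2.0, 1.03)
```

For a disordered chain, plotting `kratky(s, I)` against `s` is expected to rise without a clear maximum, whereas a compact globular protein gives a bell-shaped curve.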

Nuclear Magnetic Resonance (NMR) Spectroscopy
All NMR experiments were performed on a Bruker Avance III 700 MHz spectrometer, except for the 3D assignment experiments of β2AR-Cter, performed on an 800 MHz spectrometer, and the ³JHNHA measurements of GHSR-Cter, performed on a 500 MHz spectrometer. The 700 MHz and 800 MHz spectrometers are equipped with a cryogenic triple-resonance (¹H, ¹⁵N, ¹³C) probe and shielded z-gradients. All NMR experiments were recorded at 20 °C in a buffer (named NMR buffer) composed of 50 mM Bis-Tris pH 6.7, 150 mM NaCl, 1 mM EDTA, 0.5 mM TCEP, 5% D2O (Eurisotop), and 5 mM DSS-d6 (2,2-dimethyl-2-silapentane-5-sulfonate, Sigma) as internal reference [42]. All experiments used the pulse sequences provided by Bruker Topspin 3.2. Squared cosine apodization was used in indirect dimensions, prior to zero-filling and Fourier transformation using TOPSPIN (version 4.0.6, Bruker), and data processing was performed using NMRFAM-SPARKY (version 1.414, [43]). For each NMR experiment, the concentrations of GPCR-Cters are indicated in Table S1. For all NMR experiments, data were measured for all residues of the C-terminal regions except proline residues, residue A339 of β2AR-Cter, and the first N-terminal residue. For the sequential assignment of the ¹³C/¹⁵N-labelled C-terminal regions of V2R, GHSR, and β2AR, HNCO, HN(CA)CO, HNCA, HN(CO)CA, CBCA(CO)NH, and HNCACB triple-resonance 3D experiments were recorded. HN, N, CO, Cα, and Cβ nuclei of all residues were assigned, except for the first N-terminal residue, A339 of β2AR-Cter, and proline residues.

2.6.2. Secondary Chemical Shift (SCS)
¹³Cα and ¹³Cβ chemical shifts were used to calculate secondary chemical shifts (SCS) by subtracting the random-coil chemical shifts computed with POTENCI from the experimental chemical shifts (from the 3D experiments) [44,45]. SCS were calculated for all residues of the C-terminal domains except proline residues, the N-terminal residue, A339 of β2AR-Cter, and the last residue.
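The SCS calculation can be illustrated with a short Python sketch. It uses the usual sign convention (observed shift minus random-coil shift), under which positive Cα-Cβ differences indicate helical propensity and negative ones extended conformations. The function names and the shift values are hypothetical; in practice the random-coil values would come from a predictor such as POTENCI.

```python
def secondary_shift(observed_ppm, random_coil_ppm):
    """Secondary chemical shift: observed minus random-coil value (ppm)."""
    return observed_ppm - random_coil_ppm


def scs_ca_minus_cb(ca_obs, ca_rc, cb_obs, cb_rc):
    """Combined indicator SCS(C-alpha) - SCS(C-beta) for one residue."""
    return secondary_shift(ca_obs, ca_rc) - secondary_shift(cb_obs, cb_rc)


# Illustrative shifts (ppm) for a single residue; not experimental data.
example_scs = scs_ca_minus_cb(ca_obs=58.5, ca_rc=57.5, cb_obs=29.6, cb_rc=30.0)
```

Combining the two nuclei in one value amplifies the signal because Cα and Cβ shifts move in opposite directions in helices and in extended segments.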

2.6.3. ³JHNHA Coupling
³JHNHA scalar coupling measurements were obtained according to Vuister and Bax [46]. Briefly, HNHA experiments were recorded on ¹⁵N-labelled GPCR C-termini. The intensity of the cross-peak (Scross) and the intensity of the corresponding diagonal peak (Sdiag) were extracted using Sparky. They were used to calculate the ³JHNHA scalar coupling of each amino acid using Equation (3):
$$\frac{S_{\mathrm{cross}}}{S_{\mathrm{diag}}} = -\tan^{2}\left(2\pi\,{}^{3}J_{\mathrm{HNHA}}\,\xi\right) \qquad (3)$$
where 2ξ is the total evolution time for the homonuclear ³JHNHA coupling, which was set to 26.1 ms. ³JHNHA scalar couplings were measured for all residues of the C-terminal tails except proline residues and the first glycine residue. Random-coil scalar couplings were predicted using the RC_3JHNHa server [47].
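A minimal Python sketch of this calculation is given below. It assumes the Vuister and Bax relation Scross/Sdiag = −tan²(2π·J·ξ), with ξ equal to half of the 26.1 ms evolution time quoted in the text, and inverts it for J. The function name is hypothetical, and no correction for relaxation effects is applied.

```python
import math

# Half of the total evolution time (2*xi = 26.1 ms), in seconds.
XI_S = 26.1e-3 / 2.0


def j_hnha_hz(s_cross, s_diag, xi_s=XI_S):
    """3J(HN-HA) in Hz from cross-peak and diagonal-peak intensities.

    Inverts S_cross / S_diag = -tan^2(2 * pi * J * xi); the two peaks
    are expected to have opposite signs.
    """
    ratio = s_cross / s_diag
    if ratio > 0:
        raise ValueError("cross and diagonal peaks must have opposite signs")
    return math.atan(math.sqrt(-ratio)) / (2.0 * math.pi * xi_s)


# Round trip on a synthetic ratio generated for a 7 Hz coupling.
synthetic_ratio = -math.tan(2.0 * math.pi * 7.0 * XI_S) ** 2
recovered_j = j_hnha_hz(synthetic_ratio, 1.0)
```

Because the function takes the two intensities separately, it can be applied directly to peak heights exported from a peak list.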

¹H-¹⁵N Residual Dipolar Couplings (RDCs)
RDCs were obtained by recording 2D IPAP HSQC spectra [48] in isotropic and anisotropic media. The anisotropic media were obtained by adding a 5% (w/v) mixture of polyoxyethylene 5-lauryl ether (PEG/C12E5) (Sigma) and 1-hexanol (Sigma) in a molar ratio of 0.85 [49] or by adding ~20 mg/mL of filamentous phage Pf1 (Asla biotech) [50]. Spectra were recorded on ¹⁵N-labelled GPCR C-termini in alcohol and phage media. ¹DNH dipolar couplings were measured from the difference between the doublet splittings measured in the ¹⁵N dimension in the anisotropic (J + D) and isotropic (J) spectra.
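The extraction of an RDC from the two IPAP doublets can be sketched as follows. The function names are hypothetical, the ¹⁵N Larmor frequency is an input that depends on the spectrometer, and the splittings are used as magnitudes; sign conventions for J and D differ between laboratories and are not specified in the text.

```python
def doublet_splitting_hz(upfield_ppm, downfield_ppm, n15_frequency_mhz):
    """Splitting between the two doublet components, converted to Hz."""
    return abs(downfield_ppm - upfield_ppm) * n15_frequency_mhz


def rdc_hz(splitting_aniso_hz, splitting_iso_hz):
    """Residual dipolar coupling: (J + D) minus J."""
    return splitting_aniso_hz - splitting_iso_hz


# Illustrative splittings in Hz; not experimental data.
example_rdc = rdc_hz(splitting_aniso_hz=95.5, splitting_iso_hz=93.0)
```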

¹⁵N Relaxation Experiments
Relaxation data were measured on ¹⁵N-labelled GPCR C-termini for all residues except proline residues and the first two N-terminal residues. Heteronuclear ¹⁵N{¹H}-NOE values were determined from two experiments, with on-resonance (saturated spectrum) and off-resonance (unsaturated spectrum) ¹H saturation, recorded in an interleaved manner. Saturation was applied with 120° pulses (~10 kHz) for 6 s, and the recycle delay was set to 6 s. NOE values were obtained from the ratio of intensities measured in the saturated (I) and unsaturated (I0) spectra. Longitudinal (R1) and transverse (R2) relaxation rates were measured through acquisition of ¹⁵N-HSQC spectra with different relaxation delays: 10, 50, 100, 200, 400, 600, 800, 1000 ms for R1, and 16, 32, 64, 96, 160, 240, 480, 640 ms for R2. For each peak, the intensity was fitted to a single exponential decay using Sparky [43] to obtain the relaxation parameters. For all relaxation parameters, three residues at the N- and C-termini were discarded from the calculation of average values due to their inherently higher flexibility.
For paramagnetic relaxation enhancement (PRE) experiments, the C378A or C406A variants of ¹⁵N-labelled β2AR-Cter were labeled on the remaining cysteine using 3-(2-iodoacetamido)-PROXYL (Merck). Paramagnetic samples were recorded with a recycling delay of 2 s. Reference diamagnetic samples were recorded in the same conditions after the addition of 5 mM fresh ascorbic acid, pH 6.7, in the NMR tube. PREs were analyzed by measuring the peak intensity ratios (Ipara/Idia) between two ¹⁵N-HSQC spectra of paramagnetic and diamagnetic samples. The theoretical profile expected for a strictly random-coil polymer was calculated according to [51].
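To illustrate the single-exponential analysis of the relaxation series, the sketch below estimates a rate from peak intensities by a log-linear least-squares fit of I(t) = I0·exp(−R·t). This is a simplified stand-in, not the fitting routine implemented in Sparky, and the function name and synthetic intensities are hypothetical.

```python
import math

# R2 relaxation delays from the text, converted to seconds.
R2_DELAYS_S = [0.016, 0.032, 0.064, 0.096, 0.160, 0.240, 0.480, 0.640]


def fit_single_exponential(delays_s, intensities):
    """Return (rate, i0) for I(t) = i0 * exp(-rate * t).

    Uses ordinary least squares on ln(I) versus t, so all intensities
    must be strictly positive.
    """
    if len(delays_s) != len(intensities) or len(delays_s) < 2:
        raise ValueError("need at least two matching (delay, intensity) pairs")
    log_i = [math.log(i) for i in intensities]
    n = len(delays_s)
    mean_t = sum(delays_s) / n
    mean_y = sum(log_i) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, log_i))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic noise-free decay with a rate of 2.8 s^-1 (illustrative only).
synthetic = [100.0 * math.exp(-2.8 * t) for t in R2_DELAYS_S]
fitted_rate, fitted_i0 = fit_single_exponential(R2_DELAYS_S, synthetic)
```

A log-linear fit weights low-intensity points more heavily than a direct nonlinear fit, so on noisy data the two approaches can give slightly different rates.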

Ensemble Calculations
Ensembles of explicit models were generated using Flexible-Meccano (FM) [52], which sequentially builds peptide planes based on amino acid-specific conformational propensities and a simple volume exclusion term. To account for deviations from a random-coil description, different structure ensembles of 50,000 conformers were computed, including user-defined local conformational propensities in different regions of the protein. Local conformational propensities were first localized using the consensus of all NMR data and were then adjusted by comparing back-calculated and experimental ¹DNH RDCs to obtain the lowest χ² (for more details, see [53]). For β2AR-Cter, a long-range contact of 15 Å between two regions affected by the probe, i.e., from residues 338 to 357 and from residues 367 to 386 (regions in grey in Figure S7b), was introduced to get a better agreement between back-calculated and experimental ¹DNH RDCs.
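The comparison between experimental and back-calculated RDCs can be sketched with a simple agreement score. The definition used here, a χ² normalized by the number of couplings and by an assumed uniform experimental uncertainty, is an assumption made for illustration; the exact χ² definition used with Flexible-Meccano is described in [53]. Function names, ensemble labels, and values are hypothetical.

```python
def reduced_chi_square(experimental, back_calculated, sigma_hz=1.0):
    """Mean squared deviation between RDC sets, scaled by sigma_hz."""
    if len(experimental) != len(back_calculated) or not experimental:
        raise ValueError("RDC lists must be non-empty and of equal length")
    if sigma_hz <= 0:
        raise ValueError("sigma_hz must be positive")
    total = sum(((calc - exp) / sigma_hz) ** 2
                for exp, calc in zip(experimental, back_calculated))
    return total / len(experimental)


def best_ensemble(experimental, candidates, sigma_hz=1.0):
    """Return (label, score) of the candidate with the lowest score.

    candidates maps an ensemble label to its back-calculated RDCs.
    """
    scores = {label: reduced_chi_square(experimental, calc, sigma_hz)
              for label, calc in candidates.items()}
    label = min(scores, key=scores.get)
    return label, scores[label]


# Illustrative RDCs in Hz; not experimental data.
exp_rdcs = [1.0, 2.0, 3.0]
trial_ensembles = {
    "random_coil": [3.0, 3.0, 3.0],
    "biased": [1.5, 2.0, 2.5],
}
best_label, best_score = best_ensemble(exp_rdcs, trial_ensembles, sigma_hz=0.5)
```

In an iterative refinement, the populations of the imposed secondary structures would be varied, a new ensemble generated for each trial, and the trial with the lowest score retained.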

The Disordered C-Termini of V2R, GHSR, and β2AR Contain Transient Secondary Structures
As is the case for many C-terminal domains of GPCRs [54], the C-terminal regions of V2R, GHSR, and β2AR are shown to be disordered (Figure 2) and are predicted to contain transient secondary structures by various computational tools (Figure 3). In fact, they are composed of more than 50% of disorder-promoting residues (Arg, Gly, Gln, Ser, Pro, Glu, and Lys) (Figure 3a) [55]. In the charge-hydropathy plot [56] (Figure S1), V2R-Cter is at the disorder-order boundary, while β2AR-Cter (pink) and GHSR-Cter (green) appear in the cluster of disordered proteins (yellow). Sequence-based disorder prediction by a set of six predictors (Table S2) showed an overall disordered state (values higher than 0.5) for the three C-termini (Figure 3b). More specifically, predictions by PrDOS, DISOPRED3, and Espritz-NMR showed less disordered regions from residues ~351 to 365 for V2R-Cter and from residues ~343 to 359 for GHSR-Cter (Figure 3b). β2AR-Cter is also predicted to be less disordered from ~345 to 356 and from ~369 to 405 according to the PrDOS, DISOPRED3, and Espritz-NMR predictors (Figure 3b). Interestingly, these regions are predicted to contain helical conformations by a set of distinct secondary structure predictors (Figure 3c, Table S3).
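The fraction of disorder-promoting residues quoted above can be computed directly from a sequence. The sketch below uses the residue set listed in the text (Arg, Gly, Gln, Ser, Pro, Glu, and Lys); the function name and the example sequence are hypothetical and do not correspond to any of the three receptors.

```python
# One-letter codes for Arg, Gly, Gln, Ser, Pro, Glu, and Lys.
DISORDER_PROMOTING = frozenset("RGQSPEK")


def disorder_promoting_fraction(sequence):
    """Fraction of residues belonging to the disorder-promoting set."""
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    hits = sum(aa in DISORDER_PROMOTING for aa in residues)
    return hits / len(residues)


# Made-up ten-residue peptide used only to exercise the function.
example_fraction = disorder_promoting_fraction("GSPEKRQAAL")
```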
In order to characterize these domains experimentally, we expressed and purified the intrinsically disordered and soluble C-terminal regions of V2R (343-371), GHSR (339-355), and β2AR (342-413). SEC-MALS analysis revealed a single elution peak for each protein at a volume that corresponded to the volume of standard proteins with a molecular mass greater than 13 kDa. However, the masses derived from the MALS analysis are 3.9 (±3.2%), 3.4 (±3.7%), and 8.6 (±1.1%) kDa for V2R-Cter, GHSR-Cter, and β2AR-Cter, respectively (Figure 2a). This is in agreement with the expected molecular weight of their monomeric forms (3.3, 3.6, and 8.2 kDa, respectively). Molecular masses were confirmed by mass spectrometry (MS), with 3.264, 3.672, and 8.175 kDa obtained for V2R-Cter, GHSR-Cter, and β2AR-Cter, respectively. The behavior of V2R-Cter, GHSR-Cter, and β2AR-Cter on a SEC column is typical of a disordered protein, with a smaller elution volume than expected for globular proteins of the same molecular mass [57]. In addition, far-UV CD spectra showed a minimum around 198 nm, characteristic of unfolded proteins [58] (Figure 2b), and a negative shoulder around 222 nm, suggesting the presence of residual secondary structures [59] (Figure 2b). The Kratky plots extracted from SAXS data were typical of disordered regions, with no clear maximum and a monotonic increase along the momentum transfer range (Figure S2) [18,60,61]. Additionally, ¹⁵N-HSQC spectra of the three C-terminal regions showed a reduced amide proton spectral dispersion (around 1 ppm), typical of a disordered protein (Figure 2c) [18,19].
Altogether, bioinformatics analyses and experimental data confirm the disordered nature of the C-terminal regions of V2R, GHSR, and β2AR [62]. Furthermore, these results suggest the presence of residual secondary structures.

Location of the Transient Secondary Structures in V2R, GHSR, and β2AR C-Termini
In order to localize the transient secondary structures of these three GPCR C-terminal regions, we performed an NMR study, as described in [53]. Before NMR investigation, the backbone assignments were performed on the studied C-terminal domains of GPCRs (Figure S3) (BMRB accession codes, respectively: 51318, 51317, and 51316). Then, we used a consensus of four NMR parameters to highlight the secondary structure content. First, ¹³C secondary chemical shifts (SCS), which are highly sensitive to the backbone conformations, were computed (Figure 3d and Figure S4b-g) [45,63]. Second, three-bond HN-Hα J-coupling constants (³JHNHA), which are related to the ϕ angles of the polypeptide chain, were compared to random-coil scalar couplings [46,47]. Third, residual dipolar couplings (RDCs), which are related to the orientation of the backbone amide vector relative to the magnetic field [64], give information on the location and type of secondary structure. These data were also compared to RDCs back-calculated from random-coil ensembles computed with Flexible-Meccano [52,65]. Finally, dynamic parameters, such as heteronuclear ¹⁵N{¹H}-NOEs and longitudinal (R1) and transverse (R2) relaxation rates, give information on local backbone mobility on the ps-ns timescale, while R2 is also sensitive to motions on the µs to ms timescale (chemical or conformational exchange processes). Thus, the R2/R1 ratio reveals slow conformational motions or conformational/chemical exchange. Depending on the NMR experiments, the GPCR-Cter sample concentrations varied, but this did not affect the structure (Table S1, Figure S3).
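As a simple illustration of how the R2/R1 ratio can be used, the sketch below computes the ratio per residue and lists the residues lying above the average, with the average taken after discarding three residues at each terminus as described in the relaxation methods. The function names and the rate values are hypothetical, and using the plain average as a cut-off is a simplification.

```python
def trimmed_mean(values, n_trim=3):
    """Mean of values after discarding n_trim entries at each end."""
    core = values[n_trim:len(values) - n_trim]
    if not core:
        raise ValueError("not enough values left after trimming")
    return sum(core) / len(core)


def residues_above_average(residue_numbers, r1, r2, n_trim=3):
    """Return (average R2/R1, residues whose ratio exceeds that average)."""
    if not (len(residue_numbers) == len(r1) == len(r2)):
        raise ValueError("inputs must have the same length")
    ratios = [b / a for a, b in zip(r1, r2)]
    average = trimmed_mean(ratios, n_trim)
    flagged = [res for res, ratio in zip(residue_numbers, ratios)
               if ratio > average]
    return average, flagged


# Made-up rates for eight residues; not experimental data.
residues = [1, 2, 3, 4, 5, 6, 7, 8]
r1_rates = [2.0] * 8
r2_rates = [2.0, 2.0, 2.0, 3.0, 5.0, 3.0, 2.0, 2.0]
avg_ratio, flagged_residues = residues_above_average(residues, r1_rates, r2_rates)
```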
Figure 3. Bioinformatics predictions, secondary structure propensity, backbone dynamics, and RDC conformational profile of the disordered V2R-Cter (blue), GHSR-Cter (green), and β2AR-Cter (purple). (a) GPCR C-termini are composed of more than 50% disorder-promoting residues (in orange). Order-promoting residues are indicated in purple. Secondary structures obtained by the consensus of all NMR data are indicated under the sequence and highlighted according to their respective color code. Helix, β-strand, extended conformation (β-strand or polyproline helix 2, PPII), and turns are represented as red cylinder, blue arrow, purple box, and green bars, respectively; (b) disorder prediction by SPOT-Disorder 2 (red), SPOT-Disorder Single (black), PONDR-FIT (purple), PrDOS (green), DisPro (orange), DISOPRED 3 (grey), and Espritz-NMR (blue). The disorder/order threshold (0.5) is indicated by a black line; (c) secondary structure predictions by the SOPMA, PSIPRED, JPRED4, PSSpred, SPOT 1D, and SPIDER 3 web servers are represented for helices (red cylinder), strands (blue arrow), and turns (green); (d) computed secondary structure SCS Cα-SCS Cβ using random-coil chemical shifts from POTENCI; (e) heteronuclear 15

Vasopressin V2 Receptor C-Terminal Domain (V2R-Cter)
For V2R-Cter, a central helical region was identified from residues 356 to 364, called V2-1 from now on (Figure 3). Indeed, ¹³C SCS presented positive values in this region (Figure 3d). The presence of this helix is in agreement with disorder and secondary structure predictions (Figure 3b,c). Globally, in V2R-Cter, 32% of the residues displayed ³JHNHA scalar coupling values below 6 Hz, normally assigned to helical conformations, and the rest were between 6 and 8 Hz, consistent with a random-coil (RC) polypeptide chain (Figure S4i). Moreover, experimental ³JHNHA scalar couplings for the V2-1 region were lower than the predicted ones (Figure S4h) [47], substantiating its helical conformation. Further evidence for transient secondary structure elements in V2-1 came from dynamical parameters. V2R-Cter showed low heteronuclear ¹⁵N{¹H}-NOE values, as expected for a disordered polypeptide. However, in the V2-1 region, heteronuclear NOE values were above the average (−0.09 ± 0.01), suggesting an enhanced rigidity in the region (Figure 3e). Residues of V2-1 displayed larger R2 values than the average (2.80 ± 0.05 Hz), suggesting the presence of slow dynamic processes. R1 and R2/R1 also displayed slightly higher values for V2-1 than their averages, 1.75 ± 0.03 Hz and 1.59 ± 0.04, respectively (Figure S5). This transient helical secondary structure in V2-1 was also highlighted by ¹DNH RDC values. RDCs were measured in the alcohol mixture (C12E5/hexanol medium) and in filamentous bacteriophage Pf1. The homogeneity of the aligned samples was checked using the quadrupolar splitting of D2O, which was larger in the alcohol medium (27 Hz) than in the Pf1 medium. While RDC values were mainly negative, as observed in disordered proteins, the segment encompassing V2-1 displayed higher values than those expected for a random coil, which was in agreement with the presence of a helix in this region (Figure 3f).
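The percentages derived from ³JHNHA couplings can be reproduced with a simple threshold classification. The sketch below uses the limits mentioned in the text (below 6 Hz for helical conformations, 6 to 8 Hz for random coil); assigning values above 8 Hz to extended conformations is an assumption made here for completeness. Function names and coupling values are hypothetical.

```python
from collections import Counter

LABELS = ("helical", "random coil", "extended")


def classify_coupling(j_hz, helix_max=6.0, coil_max=8.0):
    """Assign a conformational class to one 3J(HN-HA) coupling (Hz)."""
    if j_hz < helix_max:
        return "helical"
    if j_hz <= coil_max:
        return "random coil"
    return "extended"  # assumed class for couplings above coil_max


def conformation_fractions(couplings_hz):
    """Fraction of residues falling in each conformational class."""
    if not couplings_hz:
        raise ValueError("no couplings provided")
    counts = Counter(classify_coupling(j) for j in couplings_hz)
    return {label: counts[label] / len(couplings_hz) for label in LABELS}


# Made-up couplings in Hz; not experimental data.
example_fractions = conformation_fractions([4.5, 5.9, 6.0, 7.2, 8.0, 9.1])
```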

Ghrelin Receptor C-Terminal Domain (GHSR-Cter)
For GHSR-Cter, we identified two transient secondary structures: from residues 345 to 348 (called GH-1) and from residues 356 to 361 (called GH-2) (Figure 3). These regions were predicted to form helices (Figure 3c) and SCS showed positive values, suggesting helical conformations (Figure 3d). The SCS profile also presented high positive values for residues 354-355 that could be related to the presence of a turn, as predicted by the secondary structure predictor SOPMA. ³JHNHA scalar couplings confirmed that GHSR-Cter was mostly disordered (50%) with transient helical conformations (46%) and a small portion of extended conformation (4%) (Figure S4i). The comparison of experimental and random-coil ³JHNHA scalar couplings was in agreement with the presence of transient helices in GH-1 and GH-2 (Figure S4h). Additionally, in these two regions, heteronuclear NOE, R1, R2, and R2/R1 showed higher values than their averages, 0.11 ± 0.01, 2.11 ± 0.04 Hz, 3.17 ± 0.06 Hz, and 1.49 ± 0.04, respectively (Figure 3e and Figure S5). This indicated less flexibility in these regions, suggesting some local structure. When measuring RDCs, a small interaction between the alcohol mixture and the GH-1 region (chemical shift differences > 0.01 ppm between ¹⁵N-HSQC spectra) was observed (Figure S6); thus, we analyzed RDCs measured in phage Pf1 medium (quadrupolar splitting of 18 Hz). ¹DNH RDCs exhibited lower (GH-1 region) and higher (GH-2 region) values than those expected from a random coil, suggesting an extended and a helical conformation in these regions, respectively (Figure 3f).

β2-Adrenergic Receptor C-Terminal Domain (β2AR-Cter)
For β2AR-Cter, we identified two regions forming secondary structures: from residues 349 to 357 (called β2-1) and from 368 to 376 (called β2-2). They were predicted to be more ordered and to contain helical secondary structures (Figure 3a,b). However, SCS presented negative values for β2-1 and positive values for β2-2, corresponding to an extended and a helical conformation, respectively (Figure 3d). ³JHNHA scalar couplings were consistent with an overall unstructured protein (73% of the residues) with a small content of helical (25%) and extended conformations (1%) (Figure S4i). In β2-2, experimental scalar couplings were lower than those predicted for a random coil, suggesting a helical conformation in this region. Note that glycines are not predicted by the RC_3JHNHa server; thus, scalar couplings in β2-1 could not be properly compared to random-coil values [47] (Figure S4h). In β2-1 and β2-2, heteronuclear NOE and R2 values were above their averages, 0.11 ± 0.01 and 3.44 ± 0.01 Hz, respectively. This indicated restricted flexibility in these regions (Figure 2e and Figure S6). However, in β2-2, R1 remained flat. Consequently, R2/R1 displayed larger values than the average (1.83 ± 0.03), suggesting conformational fluctuations on the µs-ms timescale. RDC measurement in Pf1 medium showed a small interaction with β2-2 (chemical shift differences > 0.01 ppm) (Figure S6); thus, we used RDC data extracted from the alcohol mixture (quadrupolar splitting of 31 Hz). Compared to random-coil values computed with FM, experimental RDCs showed more positive values in β2-1 and more negative values in β2-2 (Figure 3f), suggesting a helical and an extended conformation, respectively. However, unlike SCS, RDCs are sensitive not only to local structure but also to long-range transient contacts [66,67].
Indeed, paramagnetic relaxation enhancement (PRE) data of the C406A variant, where the paramagnetic probe at C378 lies just after the β2-2 region, showed a reduction of the intensity ratio in the N-terminal part of β2AR-Cter, including the β2-1 region and its N-flanking region, revealing long-range contacts between the N-terminal part of β2AR-Cter and the β2-2 C-flanking region (respectively, from residues 338 to 357 and 367 to 386). PRE-affected regions are highlighted in grey in Figure S7b. These long-range interactions could affect the overall profile of RDCs, probably explaining the absence of consensus between SCS and RDC data (see below). The second PRE dataset, measured in the C378A variant, where the paramagnetic probe lies in the C-terminal region (C406), did not induce a substantial reduction of the intensity ratio in β2AR-Cter, indicating the overall disorder of the protein.
Interestingly, a slight reduction of intensity in the N-terminal part of β2AR-Cter confirmed the presence of fuzzy long-range contacts between the N- and C-terminal parts of the protein (Figure S7c).

Conformational Ensemble of GPCR-Cters
To further characterize the transient secondary structures or turns in the three C-terminal domains of GPCRs, we built biased ensembles using Flexible-Meccano (FM) (Figure 3f). Ensembles of 50,000 conformers were built for each of V2R-Cter, GHSR1a-Cter, and β2AR-Cter and were used to compute back-calculated RDCs (Figure 3f). In these ensembles, we added as constraints the transient secondary structures determined by the consensus analyses of all NMR data to improve the agreement between experimental and back-calculated RDCs. Then, the populations of the local conformational propensities and turns were adjusted by monitoring the agreement using χ². The quality of these ensembles was evaluated by comparing the back-calculated RDCs with the experimental ones and was optimized until the lowest possible χ² (near 1) was obtained. RDCs were measured in the alcohol mixture for V2R-Cter and β2AR-Cter or in phage Pf1 medium for GHSR1a-Cter. For V2R-Cter, the best agreement (χ² = 1.41) was obtained when 5% of α-helix was imposed in V2-1 and when polyproline II helices (PPII) were added at 25%, 35%, and 70% at positions 344-347, 349-350, and 368-370, respectively. For GHSR1a-Cter, the best ensemble (χ² = 1.49) was obtained by adding 8% of α-helix in GH-2 and 20% of the type I β-turn that was predicted for residues 353 to 354. Additional secondary structures were added (residues 337-338: 50% of type I β-turns; 340-341: 15% of type I β-turns; 343-344: 5% of γ-turn; 363-366: 10% of β-strand). For β2AR-Cter, the lowest χ² (χ² = 1.36) was obtained when the long-range contact (15 Å) identified with PRE between the regions surrounding β2-1 (from residues 338 to 357) and β2-2 (from residues 367 to 386) was incorporated into the model (see details in Materials and Methods 2.6.7.; PRE-affected regions are highlighted in grey in Figure S7b). These results substantiate the presence of transient long-range interactions in β2AR-Cter. In β2-1, two type I β-turns at 50% were added, and in β2-2, a polyproline II helix (PPII) at 10% was imposed. This ensemble was also constrained by two other PPII at positions 380-383 and 391-394, a type I β-turn at positions 388-389, and a helix from residues 399 to 402.
In all C-terminal domains, the introduction of these secondary structures resulted in a better description of the experimental data, suggesting a more accurate description of the conformational ensembles.

Discussion and Conclusions
All GPCRs share a conserved and folded 7TM domain involved in signal transmission. Conversely, the regions outside the 7TM core (N-terminal and C-terminal domains and loops) are rather diversified in length and sequence [4] and are involved in the functional properties of GPCRs, such as ligand and partner binding. It is interesting to note that these regions are predicted to contain intrinsically disordered regions (IDRs) [5,6], which could play key roles in GPCR interaction (for review [68,69]). Indeed, IDRs explore an astronomical number of conformations in solution that we assume to be in fast equilibrium, and very often contain pre-formed secondary structure elements. In the majority of cases, these transient secondary structures, or short linear motifs (SLiMs), are involved in the binding process with their partner [15], which makes IDPs very well suited agents for signaling processes, such as the arrestin:GPCR interaction. Thus, the characterization of the structural features of these regions of GPCRs is crucial to reveal the molecular basis of signaling and cell regulation [70]. For instance, it was reported that the truncated C-terminal domain of GPR50 (GPR50-Cter) translocates to the nucleus and directly regulates gene transcription; thus, the cleavage of this Cter represents an unconventional signaling mode for GPCRs [70]. Intriguingly, the GPR50 receptor has been deeply remodeled through evolution by the mutation of numerous residues and by the addition of a long C-terminal domain. Indeed, the GPR50 homolog found in lower vertebrates lacks the Cter of human GPR50 [71]. This illustrates how the study of the isolated C-terminal domains of GPCRs is relevant in its own right for understanding the multitude of signaling pathways that they regulate.
With the hypothesis that the C-terminal domains of GPCRs contain partially structured elements involved in signaling pathways, we characterized the free state of the C-terminal domains from three commonly studied GPCRs: the vasopressin V2 receptor (V2R-Cter), the ghrelin receptor type 1a (GHSR-Cter), and the β2-adrenergic receptor (β2AR-Cter). These three class A receptors are important therapeutic targets [2] and present different affinities for arrestins [16].
The structural characterization of these disordered C-termini was challenging because of their inherent flexibility, and required the combined application of several biophysical tools. First, using SAXS, MALS, and CD, we confirmed that V2R-Cter, GHSR-Cter, and β2AR-Cter are IDRs containing secondary structures. Then, we used NMR in combination with computationally generated ensembles to locate and identify these residual secondary structure elements. The C-terminus of V2R displayed a central helix from residues 356 to 364 (V2-1). The C-terminus of GHSR encompassed two helices, from residues 345 to 348 (GH-1) and from residues 356 to 361 (GH-2). The C-terminus of β2AR contained two structured regions: an extended conformation from residues 349 to 357 (β2-1) and a helix from residues 368 to 376 (β2-2) (Figure 4). Moreover, the propensities of these secondary structures were low (~25% on average), which illustrates the high flexibility of GPCR-Cters and their ability to adopt distinct conformations in solution, a general feature of IDPs/IDRs. V2R-Cter, GHSR-Cter, and β2AR-Cter are variable in length and in sequence (Figure 1). However, it is interesting to note that their residual secondary structures either encompass or lie next to residues known to be phosphorylated by GRKs (Figure 4). It is accepted that the phosphates of the C-terminal domains of GPCRs are important for the interaction with arrestin (for a review, see [10]). This strongly suggests that the transient secondary structures that we have identified are directly involved in arrestin binding.
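As an illustration of what a secondary structure propensity means in an ensemble description, the following minimal sketch computes a per-residue helical propensity as the fraction of conformers in which a residue is labelled helical, and then reports contiguous segments above a chosen threshold. It is not the analysis pipeline used in this study: the function names, the one-letter labels ('H' for helix, 'C' for coil), the toy ensemble, the threshold, and the starting residue number are all hypothetical and chosen only for the example.

```python
from typing import List, Tuple


def helix_propensity(ensemble: List[str]) -> List[float]:
    """Fraction of conformers in which each residue is labelled helical ('H').

    Each conformer is a string of per-residue labels of identical length
    (hypothetical convention: 'H' = helix, 'C' = coil).
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one conformer")
    length = len(ensemble[0])
    if any(len(conformer) != length for conformer in ensemble):
        raise ValueError("all conformers must have the same number of residues")
    return [
        sum(conformer[i] == "H" for conformer in ensemble) / len(ensemble)
        for i in range(length)
    ]


def segments_above(
    propensity: List[float], threshold: float, first_residue: int = 1
) -> List[Tuple[int, int]]:
    """Contiguous residue ranges (inclusive) whose propensity is >= threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for offset, value in enumerate(propensity):
        if value >= threshold and start is None:
            start = offset  # a segment opens here
        elif value < threshold and start is not None:
            segments.append((first_residue + start, first_residue + offset - 1))
            start = None
    if start is not None:  # segment running to the last residue
        segments.append((first_residue + start, first_residue + len(propensity) - 1))
    return segments


# Toy ensemble: four conformers over eight residues (made-up labels).
toy_ensemble = [
    "CCHHHHCC",
    "CCCCCCCC",
    "CCCHHCCC",
    "CCCCCCCC",
]
propensity = helix_propensity(toy_ensemble)
# Residue numbering starts at an arbitrary, illustrative position.
helical_segments = segments_above(propensity, threshold=0.25, first_residue=350)
```

In this toy case no residue is helical in more than half of the conformers, which mirrors the idea that a transient element is populated in only a fraction of the ensemble rather than being stably folded.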
Figure 4. GRK2 (red) and GRK6 (blue) phosphorylated sites according to [9] for β2AR, [72] for GHSR, and [73] for V2R are indicated in the sequence. Secondary structures obtained by NMR secondary structure consensus are indicated under the sequence, according to Figure 3.
Interestingly, the V2-1 region has been characterized in complex with arrestin in crystallographic and cryo-EM structures (Figure 5) [21,22,74]. These structures were obtained using a fully synthetic phospho-peptide of the V2R C-terminal domain (V2Rpp), either truncated or attached to other receptors (chimeric receptors) (Figure 5), and each of these complexes was stabilized with a Fab30 antibody. In these structures, the central region of V2Rpp adopts a β-strand that interacts with the N-domain of arrestin. Here, we show that in solution this SLiM displays a helical structure instead of an extended conformation. This result suggests that a conformational change must occur upon binding to arrestin and/or phosphorylation, a feature that has been found in several SLiMs [75]. Moreover, this structural transition might underlie the molecular regulation of the GPCR:arrestin interaction. Indeed, β-strand formation has been proposed to serve as a general mechanism by which arrestins recognize the phosphorylated carboxy-terminal domains of receptors [21]. In the residual structure profiles of GHSR-Cter and β2AR-Cter, only GH-2 and β2-2 are in helical conformations, as found for the V2-1 region. Thus, we can hypothesize that these regions interact with arrestin through a mechanism similar to the one found for V2R, and that their basal conformation changes upon arrestin binding and/or phosphorylation.
Figure 5. A conformational change must occur in the C-terminal region of V2R (blue) upon binding to arrestin and/or phosphorylation. Comparison of the free, in-solution state of V2R-Cter to its bound state identified in complexes between a fully phosphorylated phospho-peptide of the vasopressin V2 C-terminus (V2Rpp) and arrestin-2. The sequence of the human V2R C-terminus is shown on top. Residues known to be phosphorylated by GRKs are colored in green [73]. Below, the V2R-Cter sequences used in each work are represented as dashed lines. Residues encompassing the binding regions are identified in red and phosphorylated residues are noted as p. Stars (*) indicate that the V2R C-terminus is fused to another receptor (PDB: 6U1N, muscarinic receptor; 6TKO: β1-adrenergic receptor). Helices and β-strands are represented as red cylinders and blue arrows, respectively. In the frame, the PDB structure of V2R:arrestin-2 complexes is shown [21]. Phosphorylated residues of V2Rpp (in blue) are highlighted as sticks.
Until now, the free states of GPCR C-termini were poorly characterized because of their high flexibility. Comparison of our data with GPCR:arrestin complexes showed that a conformational change is expected after GRK phosphorylation and/or arrestin binding. The phospho-barcode model states that distinct GRK phosphorylation patterns at the C-termini of GPCRs lead to distinct arrestin conformations and functional outcomes [8,9,72,76]. Thus, we can anticipate that phosphorylation would dictate the signaling of GPCRs by modulating the folding of their C-termini and/or their folding upon binding. Phosphorylation has already been described as a regulator of IDP folding mechanisms for biological function [77]. To test this hypothesis, characterizing the secondary structure profile for each phosphorylation pattern of GPCR C-termini, and comparing it with the basal profile described in this study, will be key to understanding how arrestin-dependent signaling pathways are modulated.