Exploring Mammalian Genome within Phase-Separated Nuclear Bodies: Implications for the Regulation of Gene Expression

The importance of genome organization at the supranucleosomal scale in the control of gene expression is increasingly recognized today. In mammals, Topologically Associating Domains (TADs) and the active / inactive chromosomal compartments are two of the main nuclear structures that contribute to this organization level. However, recent works reviewed here indicate that, at specific loci, chromatin interactions with nuclear bodies could also be crucial to regulate genome functions, in particular transcription. They moreover suggest that these nuclear bodies are membrane-less organelles dynamically self-assembled and disassembled through mechanisms of phase separation. We have recently developed a novel genome-wide experimental method, High-salt Recovered Sequences sequencing (HRS-seq), which allows the identification of chromatin regions associated with large ribonucleoprotein (RNP) complexes and nuclear bodies. We argue that the physical nature of such RNP complexes and nuclear bodies appears to be central in their ability to promote efficient interactions between distant genomic regions. The development of novel experimental approaches, including our HRS-seq method, is opening new avenues to understand how self-assembly of phase separated nuclear bodies possibly contributes to mammalian genome organization and gene expression.


Introduction
Several physical properties of nuclear organization are critical for regulating mammalian gene expression. In interphase, the genome is highly compacted to fit into the limited space of the cell nucleus while, at the same time, it remains fully accessible to multiple interactions involving cis- and trans-acting genomic elements and RNA/protein factors. This apparent paradox of a compact but dynamic genome is resolved not only by packaging the genome into the chromatin nucleofilament, but also through a complex compartmentalization of the nucleus that contributes to functional genome organization at the supranucleosomal scale (i.e. encompassing a few tens of kb to a few Mb of DNA). The functional role of 3D genome organization has thus become an important component in the study of mammalian gene expression [1].
Another paradigm has recently been re-examined and developed: biomolecular condensates, grounded in the classical physical notion of phase separation [2]. While the use of this concept in a biological context dates back to the old notion of the coacervate, its relevance has recently been renewed by technological advances allowing in-vivo observations and mechanistic investigations [3].
Phase separation describes the spontaneous formation of a two-phase system. From a physical point of view, it covers not only the demixing of oil and water, but also the spatial segregation that can arise in aqueous solutions when the attraction between solute molecules is energetically favored over the interaction between these molecules and the aqueous solvent. The balance between interaction energies and thermal motion (or the ensuing diffusion), described by the free energy of the system, can lead under appropriate conditions to the spatial segregation of two phases of different concentrations [4]. This phenomenon is known as liquid-liquid phase separation. Indeed, self-separated droplets display several features of a liquid phase: they are dense (as opposed to gases), display no rigid order (as opposed to crystals or liquid crystals), and their molecules remain mobile (as opposed to solids and gels), with permanent exchange between the two phases. While these droplets display fluid behaviour, such as the fusion of adjacent droplets into larger ones and a shape determined by surface tension, their composition, particularly under biological constraints, makes them far more complicated than a mere liquid. Experimental strategies are thus being developed to assess the presence and specificity of phase separation inside the cell [5].
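As an illustration of how a free energy can predict the coexistence of a dilute and a dense phase, the following minimal sketch uses the textbook symmetric regular-solution model, which is an assumption made here for illustration and not a model taken from the works reviewed. The function names and the interaction parameter chi (solute-solute attraction relative to solute-solvent interaction, in units of the thermal energy kT) are hypothetical choices. In this toy model the mixture stays homogeneous for chi <= 2, and for larger chi it splits into two phases whose volume fractions are symmetric around 0.5.

```python
import math


def mixing_free_energy(phi, chi):
    """Regular-solution free energy of mixing per site, in units of kT.

    phi: solute volume fraction (0 < phi < 1); chi: interaction parameter.
    """
    entropy_term = phi * math.log(phi) + (1 - phi) * math.log(1 - phi)
    interaction_term = chi * phi * (1 - phi)
    return entropy_term + interaction_term


def coexisting_fractions(chi, tol=1e-12):
    """Return (dilute, dense) volume fractions, or None if no demixing.

    For this symmetric model the coexisting phases are the two minima of
    the free energy, i.e. the roots of its derivative on either side of 0.5.
    Assumes a moderate chi (roughly 2 < chi < 30) so that the bracket below
    contains the dilute-phase root.
    """
    if chi <= 2.0:
        return None  # thermal motion dominates: homogeneous mixture

    def slope(phi):
        # Derivative of mixing_free_energy with respect to phi.
        return math.log(phi / (1 - phi)) + chi * (1 - 2 * phi)

    lo, hi = 1e-15, 0.5 - 1e-6
    if slope(lo) >= 0 or slope(hi) <= 0:
        raise ValueError("chi outside the range handled by this sketch")
    # Bisection on the dilute branch (phi < 0.5).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0:
            lo = mid
        else:
            hi = mid
    dilute = 0.5 * (lo + hi)
    return dilute, 1 - dilute


if __name__ == "__main__":
    for chi in (1.5, 3.0):
        print(chi, coexisting_fractions(chi))
```

Real nuclear condensates involve many components and active processes, so this two-component equilibrium picture should be read only as a caricature of the balance between interaction energies and thermal motion described above.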
Phase separation was first recognized in the cytoplasm, as a mechanism for the formation of stress granules and P-bodies [4]. It has more recently been invoked in the nucleus, for instance for the formation of membrane-less organelles, also known as nuclear bodies. Much work is now devoted to identifying the hallmarks of in-vivo phase separation and devising suitable protocols to study it [6]. In this review, we will first examine the proposal that nuclear compartments are phase-separated and could influence transcriptional regulation through their association with specific genomic sequences [7,8]. We will then present a novel experimental approach, HRS-seq, to test this working hypothesis.

Compartmentalization of chromatin interactions
In the past decade, the advent of sophisticated imaging techniques and molecular biology approaches based on proximity ligation assays (3C/Hi-C) has revealed that, beyond the compaction achieved by packaging the DNA molecule at the nucleosomal level, chromatin is also organized within the three-dimensional (3D) space of the nucleus [9,10]. This 3D chromatin folding displays nested features, the most widely acknowledged being chromatin loops and topologically associating domains (TADs), where preferential long-range cis-contacts are observed [11]. A higher-order organization level also exists that partly covers the old distinction between euchromatin and heterochromatin: the active (A) and inactive (B) chromosomal compartments [12]. While cohesin and CTCF proteins are required for TAD organization, these factors are dispensable for the maintenance of chromosomal compartments, which rely on a different organizing principle [13,14]. Furthermore, while TADs are essential for cell-specific genome organization and function [1], they appear to be quite stable between cell types, and even between organisms over the course of evolution [15]. In striking contrast, chromatin loops and chromosomal compartments appear to vary during cell differentiation [16], and they therefore presumably play a central role in establishing the specific gene expression profiles that determine cell identities. To fully understand how 3D genome organization controls mammalian gene expression, it is thus critical to focus not only on long-range cis-interactions occurring at specific loci within TADs but also on trans-associations occurring between TADs within chromosomal compartments.

Nuclear body assembly by phase separation
Nuclear bodies are large membrane-less ribonucleoprotein (RNP) complexes known to be involved in several nuclear functions. For example, the synthesis of ribosomal RNAs (rRNAs) takes place in the nucleolus, the maturation of small nuclear RNAs (snRNAs) occurs in the Cajal bodies, and the histone messenger RNAs (mRNAs) are transcribed and matured in the histone-locus bodies (HLBs) (Table 1). One important aspect of functional nuclear compartmentalization is thus related to nuclear bodies. Some of them, like the HLBs, are known to gather loci that are dispersed in TADs located on distinct chromosomes, thus favoring coordinated gene transcription and efficient pre-mRNA maturation [17]. Similarly, the Cajal bodies have also been shown to contain inter-TAD interactions [18]. Transcription factories and active chromatin hubs are also large RNP complexes that have been proposed to coordinate gene expression by maintaining specific genes within a restricted 3D space of the nucleus [19]. Large RNP complexes, including some nuclear bodies, thus appear important for supranucleosomal genome organization in mammals. Indeed, their involvement in regulating transcription of specific genes suggests that they might be critical for the establishment and the maintenance of the active chromosomal compartment. However, the demonstration of such a role has so far been impeded by the lack of a genome-wide method that would allow unbiased profiling of genomic sequences associated with nuclear bodies. In our view, this is due to a continued misunderstanding of the physical nature of nuclear bodies in vivo.
It has long been thought [31] that nuclear bodies are self-organized around nucleation sites, e.g. the Nucleolar Organizing Regions (NORs) for the nucleolus or the histone H3-H4 promoter region for Drosophila HLBs [32-34]. As a precedent, several cytoplasmic components, like P-granules [35] or centrosomes [36] in C. elegans, have been discovered to behave in vivo like self-organized liquid-like droplets. However, experimental evidence supporting self-organization or self-assembly remained very scarce for nuclear bodies (for reviews see [4,37]). A step forward has been the proposal, based on in-vitro reconstitution experiments, that the phase separation of liquid-like RNP phases could control nucleolus size and assembly [38,39], as well as account for the sub-compartmentalized organization of the nucleolus [40]. The demonstration that the Intrinsically Disordered Region (IDR) of the Ddx4 protein (a critical component of the mammalian analogue to P-granules) can form phase-separated organelles, both in live cells and in vitro [41], led to the more precise hypothesis that phase separation of IDR-containing proteins could be a general mechanism for forming and regulating membrane-less organelles. These pioneering findings paved the way for a number of studies aimed at deciphering whether phase separation is involved in the organization of other nuclear compartments or bodies. In particular, phase separation was proposed to be involved in heterochromatin domain formation, based on the in-vitro observation that a major component of heterochromatin, the HP1α protein, can form liquid droplets [42,43]. However, it remains unclear whether, in vivo, heterochromatin domains actually rely on liquid-liquid phase separation (LLPS). In this case, domain formation could involve some weak multivalent chromatin binders, as has been suggested for the nucleolus [40], or could rely on a different mechanism involving polymer-polymer phase separation (PPPS), where a bridging factor (e.g. the HP1 proteins) induces a collapse of chromatin [44].
Beyond the intrinsic nature of the interacting molecules responsible for phase separation (bridging factors for PPPS vs weak multivalent binders for LLPS), the main differences between these two phase-separation processes lie in the role of the underlying polymer, i.e. the chromatin nucleofilament. In PPPS, the polymer is required not only to nucleate phase separation but also to maintain it [45]. On the contrary, the polymer is only required for nucleation of LLPS, being dispensable to maintain phase separation once a given saturating concentration of the self-associating multivalent chromatin binder has been reached [44].

Phase-separation models for transcription control
Following these discoveries, Phillip Sharp and colleagues proposed a phase-separation model for transcription control, in which a transcriptional condensate would form by phase separation at a given locus following the formation of large RNP complexes induced by the binding of transcription factors at both enhancers and gene promoters [46]. This model was recently reinforced by studies showing that: i) transcriptional coactivators, like BRD4 and the Mediator complex at active super-enhancers, together with RNA polymerase II at promoters, form transcriptional condensates [47,48], and ii) transcription factors activate genes through the phase-separation capacity of their activation domains [49]. Transcriptional condensates, however, are relatively small compared to nuclear bodies. Moreover, it is not yet totally clear whether, in this case, phase separation always relies on a liquid-like phase separation similar to the LLPS observed for larger nuclear compartments like the nucleolus, or on a polymer collapse process as in the PPPS case of heterochromatin domains described above. Indeed, on the one hand, RNA polymerase II was shown to form clusters or hubs at active genes through interactions between its carboxy-terminal domain (CTD), a prominent IDR, and transcriptional coactivators, strongly suggesting that compartmentalization occurs here through an LLPS process [7]. On the other hand, the transient unspecific binding of RNA polymerase II to the largely nucleosome-free genome of the Herpes Simplex Virus type 1 (HSV1) leads to a DNA-mediated nuclear compartmentalization through a mechanism that is clearly distinct from LLPS [45]. Given the relatively small size of these transcriptional condensates, the physical properties that usually characterize the liquid state of a molecular assembly (like surface tension) may well make no real physical and biological sense; that is precisely why the term "liquid-like phase separation" was preferred to LLPS.
However, liquid-like phase separation is also based on multiple weak interactions (hydrophobic interactions or electrostatic bonds), in contrast to PPPS involving bridging factors. Once again the role played by the chromatin polymer differentiates one process from the other, as discussed above for heterochromatin domains.
In parallel, another work indicates that various IDR-containing proteins form nuclear condensates that could selectively associate with some chromatin regions by physically retaining targeted genomic loci while excluding non-targeted regions [50]. This chromatin filtering model suggests that targeted condensation of liquid-like droplets could bring distal genomic loci together. However, these experiments use a novel CRISPR-Cas9-based technology (CasDrop) to artificially target chimeric IDR-containing proteins to chosen genomic sequences. It remains to be seen whether endogenous IDR-containing proteins act in a similar way on their natural targets. Additional work has shown the potential involvement of RNA-binding proteins [51].
In Figure 1, we provide an integrated model presenting the current working hypothesis, where we combine the concepts proposed in [46,49,50] for transcriptional condensates involving long-range cis-interactions and extend these concepts to the probable involvement of nuclear bodies favoring inter-TAD trans-associations of co-regulated genes, like those observed for HLBs and Cajal bodies [18].
Figure 1. [...] with the action of RNA processing factors containing motifs prone to multivalent interactions [51], they bring enhancers, promoters and/or nascent RNA transcripts in close vicinity, thus stabilizing long-range cis-interactions and promoting transcription. (d) In some instances, transcriptional condensates containing similar/compatible phase separation-prone motifs could finally further condense into larger nuclear sub-organelles, leading to the formation of nuclear bodies like the Histone Locus Bodies (HLBs). The latter process brings together loci with similar transcriptional regulation but located on distinct TADs/chromosomes (orange/red lines and arrowheads), thus favoring the coordinated expression of the corresponding genes.
This integrated model leads to two predictions. First, there should be at least two classes of genes: those that contact phase-separated transcriptional condensates and those that do not. Second, there should be at least two classes of membrane-less nuclear compartments: those that depend in vivo on polymer-polymer phase separation (PPPS) and those that depend on liquid-liquid or liquid-like phase separation (LLPS).

HRS-seq: a novel method to explore nuclear bodies-associated sequences
Further exploration of these predictions, and more widely of the role of nuclear bodies in genome organization, requires, as previously mentioned, an unbiased genome-wide sequencing of nuclear body-associated sequences. So far, these sequences have been difficult to analyse because no efficient and reliable method was available to purify nuclear bodies, presumably due to their membrane-less phase-separated nature. We have recently shown that performing a liquid (or liquid-like) to solid phase transition through high-salt treatments of transcriptionally active nuclei makes large RNP complexes, including nuclear bodies, insoluble [52]. This traps the genomic DNA associated with these complexes. The insoluble material is easily purified on a filtration unit. The trapped DNA, which we named the "High-salt Recovered Sequences" (HRS), can then be separated from the rest of the genome by performing a simple restriction digestion and washing out the soluble material (Figure 2) [53]. The HRS thus remain on the filter, unlike the rest of the genomic DNA. High-throughput sequencing of the HRS (HRS-seq) is then performed to obtain a global profiling of nuclear body-associated sequences. The two predictions presented in the previous section can thus now be tested in vivo using the HRS-seq method or quantitative PCR analyses of HRS assays (HRS-qPCR) in appropriate cellular models. Indeed, our recent work in mouse embryonic stem cells showed that HRS include sequences associated with nuclear bodies (like the Cajal bodies, the HLBs, the speckles and paraspeckles). Moreover, transcriptional condensates formed around super-enhancers are also retained in our assay [52]. In full agreement with the first prediction, we found that two classes of genes can be defined according to the criterion of their association (or lack thereof) with large high-salt insoluble RNP complexes [52].
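The partition of genes into two classes according to their association with HRS can be pictured with a small interval-overlap sketch. This is only a toy illustration under stated assumptions: the half-open coordinate convention, the simple "any overlap" criterion, the function names and the gene labels geneA to geneC are all hypothetical, and the published HRS-seq analysis [52] may rely on different criteria (for instance, statistical enrichment of restriction fragments).

```python
from collections import namedtuple

# Genomic interval with half-open coordinates [start, end) (assumed convention).
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlaps(a, b):
    """True if two intervals lie on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify_genes(genes, hrs_fragments):
    """Split genes into HRS-associated and non-associated name lists.

    genes: dict mapping gene name -> Interval.
    hrs_fragments: list of Interval objects recovered in the HRS fraction.
    """
    associated, not_associated = [], []
    for name, locus in genes.items():
        hit = any(overlaps(locus, fragment) for fragment in hrs_fragments)
        (associated if hit else not_associated).append(name)
    return sorted(associated), sorted(not_associated)


if __name__ == "__main__":
    toy_genes = {
        "geneA": Interval("chr1", 100, 500),
        "geneB": Interval("chr1", 900, 1200),
        "geneC": Interval("chr2", 100, 500),
    }
    toy_hrs = [Interval("chr1", 450, 600)]
    print(classify_genes(toy_genes, toy_hrs))
```

A linear scan is enough for a toy example; a genome-wide analysis would more plausibly use sorted intervals or an interval tree.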
Our work showed that HRS-located genes are highly expressed and associated with the active chromosomal compartment and active super-enhancers in a cell-type specific manner, while genes that do not lie in HRS are moderately or weakly expressed. Testing the second prediction will require experimental differentiation of PPPS from LLPS. As explained above, these two modalities of phase separation differ by the nature of the interacting molecules and the role played by the DNA/chromatin nucleofilament. Therefore, PPPS should be sensitive to nuclease treatments that remove DNA, while LLPS and liquid-like phase separation should not. Conversely, LLPS and liquid-like phase separation should be sensitive to compounds that disturb hydrophobic interactions, like 1,6-hexanediol [54], unlike PPPS. So far, sensitivity to 1,6-hexanediol has provided the best experimental evidence in favour of the involvement of liquid-like phase-separation processes in the assembly of transcriptional condensates in vivo [47,48], as well as for other classical nuclear bodies like the paraspeckles [55].
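The discrimination logic outlined above can be summarized as a simple decision rule. The sketch below is a hypothetical encoding of that reasoning, not an established analysis procedure: the function name and the boolean inputs are illustrative, and real assays yield graded, condition-dependent responses rather than yes/no answers.

```python
def infer_phase_separation_mode(nuclease_sensitive, hexanediol_sensitive):
    """Map two assay outcomes onto the expected phase-separation modality.

    nuclease_sensitive: the compartment dissolves when DNA is removed.
    hexanediol_sensitive: the compartment dissolves upon 1,6-hexanediol
    treatment, which disturbs weak hydrophobic interactions.
    """
    if nuclease_sensitive and not hexanediol_sensitive:
        # The chromatin polymer is needed to maintain the compartment.
        return "PPPS"
    if hexanediol_sensitive and not nuclease_sensitive:
        # Weak multivalent interactions maintain the compartment.
        return "LLPS or liquid-like phase separation"
    # Both or neither: the two criteria alone do not discriminate.
    return "inconclusive"


if __name__ == "__main__":
    print(infer_phase_separation_mode(True, False))
    print(infer_phase_separation_mode(False, True))
```

The "inconclusive" branch is a deliberate design choice: a compartment sensitive to both treatments, or to neither, falls outside the two-class scheme and would call for additional experiments.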
The inactivation of specific nuclear bodies by CRISPR/Cas9 technologies targeting critical components, combined with the HRS-seq approach, should soon allow extensive genomic profiling of sequences associated with specific nuclear bodies. This should lead to a much deeper understanding of how nuclear body-associated sequences affect gene expression during development and cellular differentiation, as well as in pathological situations where nuclear body formation is altered. For instance, in Spinal Muscular Atrophy (SMA), mutations of the survival of motor neuron 1 (SMN1) gene affect Cajal body formation and lead to motor neuron death [56].

Discussion
The assembly of membrane-less compartments by phase separation appears to be a powerful mechanism for nuclear compartmentalization that could drive inter-TAD interactions between distant specific genomic loci. Such a compartmentalization could be essential to coordinate complex genomic functions, in particular transcription. At the molecular scale, the thermal motion involved in phase-separation processes implies a continuous exchange of molecules between the dense and the dilute phases. Phase separation depends on the local concentration of critical components, like IDR-containing proteins, and can thus be controlled by regulating their availability. This could be achieved by simple post-translational modifications, like phosphorylation, that affect the protein's ability to establish multivalent interactions [57]. Supporting this, PRKACB (catalytic subunit of PKA cAMP-dependent protein kinase) and HIP kinases are required for the assembly of the Cajal and PML bodies, respectively [58]. However, little is known about nuclear body homeostasis, which certainly constitutes a promising topic for future investigations.
Liquid-liquid phase separation is not a feature of an isolated molecular species but depends on the properties of both this species and the solvent. In the case of nuclear bodies, the solvent corresponds to the complex nuclear environment in which the molecular species of interest is considered. A modification of the contents of this environment would thus affect phase separation. Obviously, a structural or chemical modification of the phase separation-prone molecular species would also affect spatial structuring. Various means of tuning the physical process of phase separation are thus possible within a living cell. While in-vitro experiments usually monitor physical parameters controlling phase separation (like temperature or pH), in vivo a specific adaptation of the relative strength of the molecular interactions, through some post-translational modification of the phase separation-prone protein, would offer a more precise control of the process.
The current thermodynamic description of phase separation processes is only valid on a large scale (i.e. involving a large enough number of molecules). Only then does a description in terms of macroscopic variables (such as concentrations) make sense. The intrinsic stochasticity present at the molecular scale is not addressed in the thermodynamic description. The robustness of a thermodynamically predicted phase separation with respect to this molecular noise remains to be investigated. The thermodynamic description may thus need to be replaced or supplemented with a description in terms of stochastic dynamics. [...] in vivo. An example is the observation of droplet fission [59], which is not accounted for in the current thermodynamic models of phase separation. Investigating active features of intracellular dynamic organization thus opens a fascinating research field not only for biologists but also for theoretical physicists. Phase separation is actually a special instance of the more general concept of self-organization, in which a long-range spatial structuring emerges from short-range interactions and breaks the symmetry of the homogeneous state. The mechanisms underlying self-organization range from self-assembly of equilibrium complexes to out-of-equilibrium formation of dissipative structures [60,61]. It is thus plausible that a variety of different mechanisms could be involved inside the cell.

Conclusion
The physical notion of phase separation opens novel research avenues in the field of transcriptional gene regulation by suggesting a possible interplay between assembly of nuclear bodies and recruitment of specific genomic sequences. However, it remains to be determined to what extent such interplay is dependent on phase separation, or on more complex active and/or specific processes. Here, HRS-seq can be instrumental for dissecting the relationship between 3D chromatin organization and RNP phase separation, and its implication for both cis- and trans-co-regulation of gene expression. Understanding the relevance of phase separation in a biological context will require theoretical studies devising microscopic descriptions accounting for the intrinsic fluctuations present at the intracellular scale, as well as experimental studies investigating the possible involvement of active mechanisms.
Author Contributions: All authors contributed to the conception and writing of this manuscript.