Transcription at a Distance in the Budding Yeast Saccharomyces cerevisiae

Proper transcriptional regulation depends on the simultaneous collaboration of multiple layers of control. Cells tightly balance cellular resources and integrate various signaling inputs to maintain homeostasis during growth, development, and exposure to stressors. Many eukaryotes, including the budding yeast Saccharomyces cerevisiae, exhibit a non-random distribution of functionally related genes throughout their genomes. This arrangement coordinates the transcription of genes that are found in clusters, and this coordination can occur over long distances. In this work, we review the current literature pertaining to gene regulation at a distance in budding yeast.


Overview and Background
Transcription is the production of an RNA intermediary that links the genetic information stored in the nucleus to a specific phenotype, as outlined in the 'Central Dogma of Molecular Biology' [1,2]. In all cells, transcriptional regulation is essential for the maintenance of homeostasis and adaptation to a changing environment, allowing the cell to maintain an equilibrium within the niche that it occupies. In single-celled organisms, transcription is balanced with the intracellular and extracellular signaling cues received, allowing coordination of growth with adaptation to stressors [3][4][5]. Proper gene expression is required for health, survival, adaptation, and development [6,7].
In all organisms, myriad layers of transcriptional regulation collaborate to modulate the transcriptome. The loss or dysfunction of regulation in even a single layer can result in severe cellular disorders, disease states, or even death. Canonical mechanisms that collaborate to regulate gene expression include regulatory nucleotide sequences as well as regulatory DNA binding proteins [8][9][10]. Overlaid on these are epigenetic mechanisms, one example of which is lysine acetylation, which is required for normal development in evolutionarily divergent eukaryotes (and is the subject of several excellent reviews) [11][12][13][14]. Abnormal lysine acetylation has long been recognized as a characteristic of diseases, including cancers [15][16][17]. Many species have additional layers of regulation, including modification of DNA nucleotides, regulation of transcription by microRNAs, RNA turnover and degradation (which collaborate to coordinate mRNA abundance), and spatial positioning in both two dimensions and three dimensions (within the nucleoplasm), among others [18][19][20][21].
The focus of this work is to review recent advances in the literature surrounding transcriptional regulation at a distance, using the budding yeast Saccharomyces cerevisiae as the model system. S. cerevisiae is an exceptional model system for molecular and genetic studies, and lends itself to insights in other eukaryotes [22][23][24]. While budding yeast has its own species-specific quirks, there is extensive conservation on a genetic level to humans [25,26]. Recent work has revealed valuable insights into the chromosomal distance constraints that limit transcriptional activation and repression across broad genomic regions.

Overview of Transcriptional Regulation in the Budding Yeast, Saccharomyces cerevisiae
One fundamental layer of transcriptional regulation is local cis-regulatory nucleotide sequences, which include promoters, enhancers, upstream activating sequences (UAS), and upstream repressive sequences (URS). The promoter sequence interacts directly with the RNA polymerase to form the pre-initiation complex (PIC) [27]. Formation of the PIC is stabilized by UASs to increase transcription; conversely, URSs inhibit and destabilize PIC formation [27]. S. cerevisiae has a compact genome, and regulatory sequences frequently lie in close proximity to the open reading frame (ORF) of a gene, about 300 base pairs away on average [28,29]. Promoters largely fall into two distinct families: those that are constitutively expressed under enriched nutrient growth (about 55% of promoters) and those that are induced under specific conditions (about 45% of promoters) [30].
Examples of constitutively active promoters include those for genes that are necessary for ribosome biogenesis, such as NOP12, which are regulated by a UAS for Abf1p and by URSs such as the polymerase A and C (PAC) element and the ribosomal RNA processing element (RRPE) (Figure 1). These sequences balance the expression of the 200+ genes that make up the ribosome biogenesis (Ribi) regulon [31]. During periods of rapid growth and division, the Ribi genes are upregulated and highly expressed to meet cellular demands, but during stress they are rapidly downregulated as cellular resources are diverted elsewhere [3]. Conversely, there are inducible promoters, including the GAL1 and CUP1 promoters. Transcription of genes associated with these promoters is typically repressed during rapid growth, but is activated by the presence of galactose and copper, respectively, in the growth environment [32].

The proximal regulatory sequences are often binding sites for trans-acting transcription factors (TFs) that can alter the recruitment of RNA polymerases. There are roughly 270 verified and predicted TFs in budding yeast [33,34]. TFs work in collaboration with one another for PIC formation, as seen with the ribosome biogenesis transcription factors Abf1p, Stb3p, Tod6p, and Dot6p, which function to maintain stoichiometric levels of expression of the Ribi genes during ribosome biogenesis (Figure 1A) [35][36][37].
The establishment of differential chromatin states modulates the accessibility of cis and trans factors within a spatial and temporal context. The presence of nucleosomes can inhibit transcription, and most transcribed genes have a 150-200 bp nucleosome-free region (NFR, also called a nucleosome-depleted region, or NDR) upstream of the ORF [38][39][40][41]. Epigenetic markers of active chromatin, including acetylated histones, are found at the 5′ end of actively expressed genes [42]. Induction of the stress response in budding yeast results in the upregulation of genes to adapt to a stressor, such as the heat shock proteins that act to maintain proteostasis and to modulate tRNA abundance to regulate transcription [5,43]. TF binding can alter nucleosome dynamics at the promoter region of the corresponding genes and favor PIC formation [40,44].
The spatial arrangement and positioning of genes along the chromosome contribute to the absolute levels of expression due to position effects within a genomic locus. These effects were initially characterized based on proximity to heterochromatin, including that found at the telomeres [45]. Position effects are not limited to regions near heterochromatin, however, and are prevalent throughout the genome [46]. Such position effects can result in transcriptional regulation at a distance, as is seen in adjacent gene co-regulation, a phenomenon that links transcription of functionally clustered genes via shared regulatory mechanisms [47,48]. This phenomenon results in the clustering of genes whose transcripts are required in roughly equivalent stoichiometric levels by the cell, as seen in shared biosynthetic pathways and protein complexes [49].

Transcriptional Interference and Gene Repression at a Distance
Gene proximity can influence the transcription of neighboring genes via transcriptional interference, as seen at the SRG1-SER3 locus (Figure 1B) [50]. The SRG1 transcript is a non-coding RNA species that represses the expression of SER3 when transcribed [51]. The arrangement of these two genes in a tandem orientation (→→) results in intergenic transcription from the SRG1 locus into the regulatory region of SER3 [51]. This overlap of transcription represses SER3 as part of a serine-responsive transcriptional circuit [50,51]. Transcriptional interference is a potent regulator of gene expression, and thus can favor genome organization that allows mutually exclusive transcription patterns [52].
The proximity of a regulatory element to a gene correlates with its effect on expression. The closer a promoter, enhancer, or regulatory sequence is located to a gene, the greater the influence of that sequence on the transcription of the neighboring gene(s) [53]. Simply separating a regulatory sequence from a gene with a spacer of increasing size causes a decrease in the resulting expression of a reporter gene [53,54]. Activation drops off to nil when a regulatory element and a gene are separated by approximately 600 base pairs.
Interestingly, the Mediator complex imposes one of the distance constraints that limit transcriptional activation at a distance [53]. Mediator is a multi-subunit protein complex that associates with transcriptional activators and components of the PIC to help modulate transcription [55]. The S. cerevisiae Mediator complex has three distinct domains (head, middle, and tail) and is composed of 21 subunits [56]. One such subunit is Sin4p, a component of the tail domain of the Srb/Mediator coactivator complex that plays a role in UAS-core promoter specificity. SIN4 null mutants can activate transcription at a distance of up to two kilobases [53]. Thus, the Mediator coactivator complex limits long-distance activation under normal conditions. The Mediator components Sin4p, Rgr1p, and Cdk8p are responsible for repression of long-distance transcription, and this repression depends on Med2p and Med3p [57]. In a sin4 null background, the Mediator tail components can be recruited independently of the rest of Mediator [57].
An elegant genetic screen for polygenic mutants that can transcribe at greater distances found causative mutations in MOT3, GRR1, MIT1, MSN2, and PTR3 that allow for long-distance activation at distances that are otherwise impermissible for transcriptional activation [57]. These polygenic mutants transcribe effectively at distances outside the range of the wild type; however, they cannot activate transcription as efficiently at distances within the range typical of a wild-type promoter element. Consistent with these observations, the authors reason that multiple factors regulate activation (or repression) at a distance and that in other, larger eukaryotes, the regulation of long-distance activation may be coordinated by multiple additional factors [57].
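The distance limits described above can be summarized in a minimal sketch. It assumes, purely for illustration, that the approximate figures quoted in the text (about 600 base pairs in the wild type, up to about two kilobases in a SIN4 null mutant) behave as hard cutoffs; real activation declines gradually with spacer length and varies by locus. The names `ACTIVATION_LIMIT_BP` and `can_activate` are hypothetical.

```python
# Approximate activation limits taken from the text, in base pairs.
# Treating them as sharp thresholds is a simplifying assumption.
ACTIVATION_LIMIT_BP = {
    "wild_type": 600,   # activation is roughly nil at ~600 bp
    "sin4_null": 2000,  # activation reported up to ~2 kb
}


def can_activate(spacer_bp: int, background: str = "wild_type") -> bool:
    """Return True if a regulatory element placed `spacer_bp` away from a
    gene falls inside the assumed activation range for `background`."""
    if spacer_bp < 0:
        raise ValueError("spacer_bp must be non-negative")
    if background not in ACTIVATION_LIMIT_BP:
        raise KeyError(f"unknown genetic background: {background}")
    return spacer_bp < ACTIVATION_LIMIT_BP[background]


if __name__ == "__main__":
    for spacer in (300, 1500, 2500):
        print(spacer,
              can_activate(spacer, "wild_type"),
              can_activate(spacer, "sin4_null"))
```

Under these assumptions, a 1500 bp spacer is out of range in the wild type but within range in the SIN4 null background, which mirrors the qualitative observation that loss of Sin4p extends the permissible activation distance.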

Gene Activation at a Distance
Budding yeast has a compact genome for a eukaryote, so it is important that activation occurs only over short distances [58,59]. In S. cerevisiae, UASs are typically found within 450 base pairs of the transcription start site (TSS), whereas in metazoans with larger genomes, enhancers are often located several kilobases away [60]. Gene proximity can influence transcription throughout a chromosomal region, resulting in 'pockets' of correlated gene expression genome-wide [61]. This likely occurs because promiscuous promoter and enhancer elements activate genes located at a distance [62]. In budding yeast, this distance constraint has been characterized, with a global activation distance of roughly one kilobase, although there is extensive variance depending on the genomic locus queried [62].
The orientation of genes is important for transcriptional regulation. In simple prokaryotic organisms, functionally related genes are often clustered to allow for polycistronic transcription and regulation of a gene family [63,64]. Operons are not a characteristic of most eukaryotes, with the characterized exception of C. elegans, which contains clustered genes that are transcribed as a polycistronic mRNA species [65,66]. This orientation represents an efficient manner to co-regulate multiple genes simultaneously.
One feature of yeasts, including S. cerevisiae, is the extensive clustering of functionally related genes as neighbors throughout the genome [49,67,68]. This clustering is present in a vast number of gene families whose protein products are components of the same metabolic pathways, and has been extensively characterized in the ribosomal protein (RP) and ribosome biogenesis (Ribi) families [48]. This clustering of the RP and Ribi gene families is extensively conserved throughout evolutionarily divergent eukaryotes [69].
The orientation of co-expressed, clustered genes likely facilitates the mechanism underlying expression regulation. Clusters can be found in divergent (← →), tandem (→→ and ←←), or convergent (→ ←) orientations. Divergent promoters activate multiple genes simultaneously, such as the shared GAL1-GAL10 promoter (Figure 1C) [70]. Many functionally clustered genes in yeasts are oriented in a divergent manner, allowing for a shared bidirectional promoter [67,68,71]. Many yeast promoters have been characterized as being bidirectional in nature and can function regardless of orientation relative to a gene [72,73]. The prevalence of bidirectional promoters results in pervasive 'cryptic' transcription in yeast, which is normally limited at select loci by the activity of Rap1p [74,75].
Tandem and convergent orientations can also help to modulate transcription of functionally clustered genes. When genes are arranged in tandem, there is the possibility of a single mRNA intermediary that contains coding information for both genes. A recent genome-wide analysis of S. cerevisiae found that a small, but significant, fraction of the genome is transcribed in a bicistronic manner, such as the RTC4-GIS2 locus (Figure 1D) [76]. Bicistronic transcripts account for approximately 10% of the genes in the genome [76]. A convergent arrangement lends itself to co-expression via mechanisms that may include chromatin remodeling or long-range looping interactions within the nucleoplasm [54].
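The three arrangements of neighboring genes can be made concrete with a short sketch that classifies a gene pair from strand information alone. The `Gene` record, its fields, and the example gene names and coordinates are hypothetical and chosen only for illustration; "+" denotes a gene transcribed left to right (→) and "-" a gene transcribed right to left (←).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost chromosomal coordinate (hypothetical values)
    strand: str  # "+" for →, "-" for ←


def pair_orientation(a: Gene, b: Gene) -> str:
    """Classify two neighboring genes as divergent, tandem, or convergent."""
    for gene in (a, b):
        if gene.strand not in ("+", "-"):
            raise ValueError(f"invalid strand for {gene.name}: {gene.strand}")
    # Order the pair by chromosomal position so the result does not
    # depend on the order of the arguments.
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        return "tandem"      # →→ or ←←
    if left.strand == "-" and right.strand == "+":
        return "divergent"   # ← →
    return "convergent"      # → ←


if __name__ == "__main__":
    gene_a = Gene("GENE_A", 1_000, "-")
    gene_b = Gene("GENE_B", 2_500, "+")
    print(pair_orientation(gene_a, gene_b))  # divergent
```

Only the divergent case places the two 5′ ends next to each other, which is why it is the arrangement compatible with a shared bidirectional promoter.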

Conclusions
Regulation of gene expression throughout a genomic region has important implications for our understanding of gene functions and for biotechnological applications. A paucity of data pertaining to this phenomenon has led to the misannotation of gene functions due to transcriptional disruption across a genomic region. A representative example is the attribution of a genetic interaction to CDC50 and PAN2, rather than the bona fide interaction between PAN2 and CDC39, which neighbors CDC50 [77,78]. Such effects are especially important for geneticists exploring gene functions, who frequently employ reporter genes that may disrupt the transcriptional patterns throughout a region via the neighboring gene effect. Likewise, researchers working to engineer or manipulate specific metabolic pathways for pharmaceutical and industrial uses should take heed: the locus chosen for manipulation can have unintended secondary effects [46].