Mechanisms and Functions of the RNA Polymerase II General Transcription Machinery during the Transcription Cycle

Central to the development and survival of all organisms is the regulation of gene expression, which begins with the process of transcription catalyzed by RNA polymerases. During transcription of protein-coding genes, the general transcription factors (GTFs) work alongside RNA polymerase II (Pol II) to assemble the preinitiation complex at the transcription start site, open the promoter DNA, initiate synthesis of the nascent messenger RNA, transition to productive elongation, and ultimately terminate transcription. Through these different stages of transcription, Pol II is dynamically phosphorylated at the C-terminal tail of its largest subunit, serving as a control mechanism for Pol II elongation and a signaling/binding platform for co-transcriptional factors. The large number of core protein factors participating in the fundamental steps of transcription add dense layers of regulation that contribute to the complexity of temporal and spatial control of gene expression within any given cell type. The Pol II transcription system is highly conserved across different levels of eukaryotes; however, most of the information here will focus on the human Pol II system. This review walks through various stages of transcription, from preinitiation complex assembly to termination, highlighting the functions and mechanisms of the core machinery that participates in each stage.


Introduction
Controlling gene expression is essential to normal growth, development, and sustained life. In metazoans, this requires regulating the spatial, temporal, and developmental expression of genes in a wide diversity of cell types. Mis-regulation of gene expression contributes to most disease states. The main control point for regulating gene expression is at the level of transcription. In eukaryotic cells, RNA polymerase II (Pol II) transcribes protein-coding genes into messenger RNA (mRNA) transcripts. Pol II also synthesizes long non-coding RNA (lncRNA) and most small nuclear RNA (snRNA) and microRNA (miRNA). Pol II transcription is vital for cell proliferation, proper expression of metabolic enzymes, signaling, cell fate, differentiation, gene expression, and nearly every cellular process. Although the RNA polymerase II transcription system is highly conserved across eukaryotes, this review is primarily focused on the human system, with some references to data from Drosophila and yeast systems.
The Pol II core enzyme can itself synthesize RNA using a template DNA, but promoter-specific transcription initiation requires the canonical general transcription factors (GTFs): TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH (Table 1). In addition, the large multi-subunit complex Mediator is essential for proper transcription in cells [1,2]. Chromatin remodeling/modifying complexes and additional co-regulatory factors function together with promoter-specific transcriptional activators and repressors to set the proper level and timing of transcription from individual genes in specific cell types. This review serves as an

RNA Polymerase II
Pol II is a large ~500 kDa complex made up of 12 protein subunits, named Rpb1-12. Studies have shown that 10 of the 12 subunits form the catalytic core of the Pol II complex and are either identical (Rpb5, 6, 8, 10, 12) or highly similar (Rpb1-3, 9, 11) to subunits found in RNA polymerase I and RNA polymerase III, which primarily transcribe rRNAs and tRNAs, respectively [9,10]. Recent ChIP-seq (chromatin immunoprecipitation followed by high-throughput sequencing) and mass spectrometry studies have shown that different sets of Rpb subunits differentially regulate select subsets of human genes, demonstrating the dense layers of regulation within the Pol II complex itself [11]. Crystal structures of yeast Pol II, and more recently cryo-EM structures of human and yeast Pol II, have revealed that the Pol II complex can be divided into the core, shelf, jaw lobe, and clamp structural domains that interact with each other and undergo conformational changes during the stages of transcription (these structures have been extensively reviewed in the literature [12–15]). The core domain contains Rpb3 and Rpb10-12 as well as the positively charged active center cleft, formed by Rpb1 and Rpb2 [16,17]. The active site of Pol II is buried deep at the base of the active center cleft, thus requiring translocation of the template DNA strand to the active site after entering the cleft. The shelf and jaw lobe elements have little observed movement but can rotate parallel to the active center cleft [17]. The clamp domain is connected to the active site cleft in the core domain through an array of flexible switches, and it swings nearly 30 Å upon opening or closing the cleft [9,17]. While not considered part of the catalytic core of Pol II, binding of the Rpb4 and Rpb7 subunits has been shown to be vital for maintaining the closed conformation of the Pol II clamp over the DNA during initiation [18,19]. It is hypothesized that the closing of this clamp domain over the cleft coupled with DNA distortion may facilitate promoter melting [20].
At 250 kDa, Rpb1 is the largest of all Pol II subunits and the principal catalytic subunit of the Pol II complex [21]. Beyond its catalytic role, Rpb1 plays a regulatory role in the transcription cycle that is mediated by the unstructured C-terminal domain (CTD) on the Rpb1 subunit. The CTD consists of a long tail comprising heptapeptide repeats of the consensus sequence YSPTSPS, with minor variability at the Ser7 position in repeats near the C-terminus [22,23]. Mammalian Pol II contains 52 repeats, with the number of repeats varying among different organisms in a manner that loosely correlates with genomic complexity [24]. The YSPTSPS consensus sequence is well conserved across eukaryotes, emphasizing the functional importance of each residue [24]. The CTD tail is not necessary for basal (i.e., unregulated) Pol II transcription in vitro [25,26]; however, it is required for accurate Pol II transcription and proper termination in cells [27–29]. The CTD is thought to function as a binding platform for association of numerous other protein complexes that help regulate co-transcriptional processes or steps in transcription, including RNA splicing and transcription termination [30–32].
The residues within the heptapeptide repeat are substrates for many post-translational modifications, with phosphorylation being the most well characterized. The Tyr, Ser, and Thr residues can be reversibly phosphorylated/dephosphorylated, allowing for regulation of Pol II activity through the transcription reaction and of Pol II CTD affinity for various regulatory factors [22,23,30] (Figure 1). For example, the level of specific phosphorylation marks varies across different stages of transcription, depending on the purpose of the modification. The use of phospho-specific antibodies coupled to ChIP-seq, in addition to in vitro work, has enhanced understanding of how CTD phosphorylation patterns change throughout the transcription cycle. Pol II is recruited to preinitiation complexes on promoter DNA in a hypo-phosphorylated form. Phosphorylation of Ser5 by the CDK7 kinase subunit of TFIIH (which is one of the general transcription factors discussed below) facilitates initiation of transcription. The Ser5 mark is removed as Pol II moves throughout the gene body. As a counterpoint, Ser2 phosphorylation predominantly accumulates after initiation to help recruit elongation and RNA processing factors and peaks at the 3' ends of genes where it is thought to facilitate termination [33–36]. Beyond Ser5 and Ser2, other sites of phosphorylation on the Pol II CTD include Tyr1, Thr4, and Ser7, which have not been studied in the same detail as the other CTD residues. Research has shown that the significance of these three residues can vary across species, with metazoan and yeast systems sometimes exhibiting different behaviors [37,38]. Exploring the function of these other important Pol II CTD residues provides numerous areas for future study.
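To make the scale of the CTD as a phosphorylation platform concrete, the following minimal Python sketch builds an idealized CTD and enumerates its candidate phosphorylation sites. It assumes that all 52 repeats match the YSPTSPS consensus exactly, which is a simplification (the text notes variability at Ser7 in C-terminal repeats); the function and constant names are illustrative only.

```python
# Idealized model of the mammalian Pol II CTD: 52 copies of the consensus heptad.
# Assumption: every repeat matches the consensus exactly; the real CTD
# contains non-consensus repeats, particularly near the C-terminus.
CONSENSUS_HEPTAD = "YSPTSPS"
MAMMALIAN_REPEAT_COUNT = 52

# Heptad positions (1-based) named in the text as phosphorylation sites.
PHOSPHO_SITES = {1: "Tyr1", 2: "Ser2", 4: "Thr4", 5: "Ser5", 7: "Ser7"}


def build_ctd(repeats: int = MAMMALIAN_REPEAT_COUNT) -> str:
    """Return an idealized CTD sequence made only of consensus heptads."""
    return CONSENSUS_HEPTAD * repeats


def list_phospho_sites(ctd: str) -> list[tuple[int, int, str]]:
    """Return (repeat_number, residue_index, site_name) for each candidate site.

    Both repeat_number and residue_index are 1-based and refer to the CTD
    alone, not to full-length Rpb1.
    """
    sites = []
    for index, _residue in enumerate(ctd):
        repeat_number = index // 7 + 1
        heptad_position = index % 7 + 1
        name = PHOSPHO_SITES.get(heptad_position)
        if name is not None:
            sites.append((repeat_number, index + 1, name))
    return sites


ctd = build_ctd()
sites = list_phospho_sites(ctd)
print(len(ctd), len(sites))  # 364 residues, 260 candidate sites
```

Under these assumptions, the idealized tail carries five candidate sites per repeat, which illustrates how many distinct phosphorylation patterns are possible in principle.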
Biomolecules 2024, 14, x FOR PEER REVIEW 3 of 25

PIC Assembly Mechanisms
Eukaryotic Pol II does not have sequence-specific DNA binding capacity. Therefore, it relies on an array of GTFs to properly position Pol II at the core promoter region of genes, which contains the transcriptional start site (TSS). The GTFs assemble along with Pol II into a PIC in a tightly regulated process. Decades of research using in vitro biochemical assays, cellular systems, and structural approaches have provided a wealth of fundamental insight into how PIC formation occurs; however, the precise mechanisms and pathways are not yet fully defined. In general, there are two distinct, but not mutually exclusive, models for how PIC assembly occurs (Figure 2) [39,40]. In the stepwise assembly model, the GTFs assemble on the promoter DNA in a sequential order: TFIID/TFIIA, TFIIB, Pol II/TFIIF, TFIIE, then TFIIH. This model was initially developed from in vitro biochemical studies that recombined the purified GTFs in different orders, monitoring complex assembly and transcriptional activity [41,42]. In the holoenzyme model of assembly, Pol II and many of the GTFs pre-assemble off of the promoter DNA and bind as a unit to the core promoter. Evidence supporting the holoenzyme model of assembly arose from experiments showing that Pol II co-purifies in a complex with various subsets of GTFs in the absence of DNA [43]. While the subset of GTFs isolated as part of the Pol II holoenzyme varies within the literature, one study found that TFIID and TFIIA were absent from the holoenzyme [44], suggesting that TFIID and TFIIA first bind the promoter, then recruit the Pol II/GTFs holoenzyme to form a complete PIC. It is likely that both models of assembly occur in cells, and assembly is regulated to enable specific transcriptional responses at different promoters in response to unique cellular stimuli [40].
Recent advances in our understanding of PICs come from the advent of new technologies such as cryo-EM and single-molecule imaging. A profusion of cryo-EM studies in recent years has provided detailed pictures of the architecture of PICs and other transcription complexes, many of which are referred to in other sections of this review. Importantly, advances in this technique allow multiple conformations of complexes to be resolved to provide insight into protein flexibility. Due to the large size and number of individual protein subunits that comprise PICs, capturing this flexibility informs how assembly and early steps in transcription are facilitated by specific protein-protein interactions. This body of structural work has been extensively reviewed elsewhere [12–15,45,46].
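The logic of the stepwise assembly model described above can be illustrated with a short Python sketch. It treats the sequential order (TFIID/TFIIA, TFIIB, Pol II/TFIIF, TFIIE, TFIIH) as a strict dependency chain, which is a deliberate simplification: the holoenzyme model and the dynamic binding seen in cells are not captured, and the function names are hypothetical.

```python
# Order of the stepwise assembly model as listed in the text.
STEPWISE_ORDER = ["TFIID/TFIIA", "TFIIB", "Pol II/TFIIF", "TFIIE", "TFIIH"]


def can_bind(factor: str, bound: list[str]) -> bool:
    """Return True if `factor` is the next step given what is already bound.

    Assumption: each step strictly requires all earlier steps, which is
    stricter than what is expected to occur in cells.
    """
    if factor not in STEPWISE_ORDER or factor in bound:
        return False
    required = STEPWISE_ORDER[: STEPWISE_ORDER.index(factor)]
    return bound == required


def assemble(arrival_order: list[str]) -> list[str]:
    """Add factors in arrival order, skipping any that arrive too early."""
    bound: list[str] = []
    for factor in arrival_order:
        if can_bind(factor, bound):
            bound.append(factor)
    return bound


# Factors arriving in the canonical order yield a complete PIC.
complete = assemble(STEPWISE_ORDER)

# TFIIB arriving before TFIID/TFIIA is skipped, so assembly stalls
# after the first step in this strict toy model.
stalled = assemble(["TFIIB", "TFIID/TFIIA", "Pol II/TFIIF", "TFIIE", "TFIIH"])

print(complete)
print(stalled)
```

In this toy model a single out-of-order arrival stalls assembly, whereas in vitro order-of-addition experiments and live-cell imaging indicate that real factors can rebind; the sketch is meant only to make the dependency order explicit.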
Recent advances in single-molecule imaging in vitro and single-particle tracking in live cells have also advanced our understanding of PIC formation and general transcription machinery. These studies visualize fluorescently labeled protein factors in real-time with millisecond resolution to provide a dynamic view of how factors interact with each other and the genome as well as heterogeneity in their behavior. Below we describe examples of how these findings have advanced our understanding of transcriptional control, focusing on general transcription machinery. Although outside the scope of this review, single-molecule imaging studies are also breaking new ground in understanding mechanisms of transcriptional regulation and how transcriptional activators function (for reviews, see [47–50]). For example, real-time imaging has shown that, in general, transcription activator binding is very dynamic with brief residence times on chromatin and different kinetic populations that likely reflect functional and non-functional interactions (for examples, see [51–55]). The imaging approaches and findings surrounding transcriptional activators will be informative for ongoing single-molecule studies of the general transcription machinery.
The ability of single-molecule imaging to resolve heterogeneity and rapid dynamics has provided new insight into behaviors of the GTFs and Pol II. For example, one study using reconstituted human PICs found that TFIIB binding in the PIC is highly dynamic and only becomes stably bound after Pol II/TFIIF is recruited to the PIC [56]. Another in vitro study revealed that the release of TFIIB after synthesis of a 7- or 9-nucleotide RNA is tightly coupled to maintaining the activity of complexes [57]. Work using yeast nuclear extracts with fluorescently labeled factors showed that Pol II can assemble with TFIIE and TFIIF on an upstream activation sequence before being transferred to the core promoter with the other GTFs [58,59]. Single-molecule tracking of PIC components in live yeast cells found that the GTFs will sub-diffuse through a small space in the nucleus, confined by the large complexes of Mediator and TFIID [59,60]. Imaging Pol II itself in human cells has informed our understanding of how the polymerase tracks through the stages of transcription [61,62]. Measuring the turnover rate of different Pol II populations near promoters showed Pol II molecules freely diffusing, bound to chromatin, paused at the promoter, or productively elongating [63].
Imaging Pol II and associated factors in human cells has also detailed spatiotemporal regulation of transcription through transcriptional bursting and Pol II clustering. Transcription is discontinuous over time, giving rise to fluctuations in activity known as bursting, which is supported by clusters of Pol II that form highly dynamic foci [64,65]. Studies show the Mediator complex can load several Pol II enzymes onto the promoter region to form a convoy of polymerases that initiate and enter productive elongation [66]. Moreover, data show that dynamic clusters of Pol II, Mediator, and other cofactors can form condensates [67,68]. These spatiotemporal phenomena appear to form the basis of a rapid response transcription system activated by cellular stimuli. Indeed, recent work shows that transcriptional condensates can form at enhancers to amplify the magnitude and frequency of transcriptional bursting when in proximity to a gene [69]. These important mechanisms of Pol II function and regulation are reviewed in detail elsewhere [64,70,71].

TFIID and the Core Promoter
PICs are nucleated by the general transcription factor IID (TFIID) bound to the promoter DNA. A recent comprehensive study of active promoters in human cells concluded all promoters are organized around the purpose of serving as a TFIID binding site [72]. TFIID is a large 14-subunit complex composed of the 37 kDa TATA-binding protein (TBP) and 13 TBP-associated factors (TAFs) ranging in size from 15–250 kDa [39]. The TBP subunit specifically binds to the minor groove of the TATA box, a core promoter DNA element located upstream of the TSS with an AT-rich consensus sequence that has been defined in vitro and more recently in human cells [73,74]. Upon binding, TBP induces a sharp ~90° bend in the DNA via phenylalanine residues intercalating between the DNA base pairs near each end of the TATA box [75,76]. This sharp bend helps position the DNA with respect to Pol II, which may explain why TBP or a TBP homolog is required for basal transcription at genes transcribed by all three RNA polymerases [77]. Interestingly, TBP is able to bend DNA when bound to promoters with or without a TATA box, highlighting the importance of this behavior for transcriptional activity [78]. Structural data suggest that DNA bending repositions factors at the promoter in a conformation that enables contacts that are not possible on a linear DNA conformation, and it helps to modulate the auto-repressive nature of some of the TAF subunits on TBP [13]. However, the large majority of human genes lack a TATA box near their transcriptional start sites [79–82], including many housekeeping genes and genes encoding growth factors, transcription factors, and oncoproteins [83]. A recent study utilizing rapid protein depletion followed by PRO-seq found that TBP is dispensable for driving transcriptional activity at promoters that do not contain a TBP-binding motif, suggesting the importance of other factors for promoter recognition [74].
Without a TATA box, promoter recognition must occur via interactions mediated by the TAF subunits of TFIID. The TAFs recognize other promoter elements, most notably the initiator (Inr) element [84]. The Inr surrounds the TSS and is sufficient for initiation in the absence of other promoter elements [82,85]. Inr is thought to be present at most human promoters outside of ribosomal protein genes [72]. Sequences downstream of the TSS are also important for promoter recognition by TFIID. Several downstream promoter elements have been characterized in Drosophila systems, but their presence in human systems is poorly understood. However, studies that incorporated computational modeling, sequencing techniques, and functional studies have found evidence for active downstream promoter motifs in human systems [86,87]. Moreover, evidence of multiple TAFs binding downstream of the TSS has been found using crosslinking [88], cryo-EM [89], ChIP-exo experiments, and in vitro transcription assays followed by quantitative mass spectrometry [90]. Structural studies suggest that the TAF subunits may change conformations, or even dissociate, after depositing TBP at the promoter due to predicted steric clashes with GTFs and Pol II that subsequently assemble into the PIC [91]. TFIID, and particularly TAF1, has also been shown to be involved in promoter proximal pausing and positioning of the +1 nucleosome downstream of the TSS [72,92]. The binding and proper alignment of TFIID on the core promoter is critically important for PIC nucleation and regulation of early transcription [93].
Our understanding of core promoter complexity has evolved in recent years due in large part to information gained from diverse applications of high-throughput sequencing approaches. Active core promoters exhibit different initiation patterns defined in part by TSS selection. Genes that initiate transcription over a broad region from multiple transcription start sites exhibit "dispersed/broad" initiation, whereas genes following the canonical structure of a single predominant TSS have "focused/sharp" initiation [84,94]. Focused genes are typically those that are tightly regulated or cell type-specific, whereas dispersed genes are often broadly expressed across many cell types, such as housekeeping genes. It is believed that dispersed promoters in human cells function by allowing for the assembly of multiple PICs with defined TSSs that together define a transcriptional start region, in contrast to yeast where a single PIC may drive initiation from multiple TSSs [72]. These two broad promoter classes have many important differences, including arrangement/presence of core promoter motifs, placement of the +1 nucleosome downstream of the TSS, behavior of promoter proximal pausing, and associated coactivators, among others [95,96]. Another added layer of complexity present at core promoter regions is bidirectional or divergent transcription, in which two separate Pol II complexes initiate transcription in opposite directions [97,98]. At protein coding genes, this process produces the mRNA plus an unstable upstream noncoding RNA (ncRNA) in the antisense orientation [99–101]. Enhancers, which are regions of DNA that contain binding sites for sequence-specific transcriptional activators/repressors, are also sites of bidirectional transcription. Signals of bidirectional transcription at enhancers indicate their active influence in gene expression [102,103]. This process generates enhancer RNAs (eRNAs) that can influence gene expression [97]. The regulatory network between promoters, Pol II, TFIID, and other general factors continues to provide a rich field for new discoveries.
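The distinction between focused/sharp and dispersed/broad initiation can be illustrated with a small Python sketch that classifies a promoter from per-position TSS read counts. The rule used here (the fraction of reads at the single most-used TSS, with a cutoff of 0.8) is an assumption chosen for illustration, not a criterion taken from the studies cited above; published analyses use a variety of metrics.

```python
def classify_initiation(tss_counts: dict[int, int], focus_threshold: float = 0.8) -> str:
    """Classify a promoter as focused/sharp or dispersed/broad.

    tss_counts maps a position (relative to an arbitrary reference) to the
    number of initiation events observed there. focus_threshold is a
    hypothetical cutoff on the fraction of events at the dominant TSS.
    """
    total = sum(tss_counts.values())
    if total == 0:
        raise ValueError("no initiation events to classify")
    dominant_fraction = max(tss_counts.values()) / total
    if dominant_fraction >= focus_threshold:
        return "focused/sharp"
    return "dispersed/broad"


# Made-up example data: one dominant TSS versus initiation spread over a region.
focused_example = {0: 95, 3: 5}
dispersed_example = {-20: 10, -5: 12, 0: 15, 8: 11, 30: 9}

print(classify_initiation(focused_example))
print(classify_initiation(dispersed_example))
```

The example counts are invented; with real data, the threshold would need to be calibrated against the sequencing assay and read depth.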

TFIIA
The main role of TFIIA during transcription is the stabilization of the TFIID-DNA interaction. TFIIA is a heterotrimer consisting of α, β, and γ subunits with masses of 35, 19, and 12 kDa, respectively, and binds just upstream of the TATA box. TFIIA makes direct contacts with TFIID, TBP, TFIIE, and TFIIF within the PIC [89,104]. TFIIA also binds to several activators and repressors, and it can enhance the effects of co-activators [105]. TFIIA binding to activators and TFIID has been proposed to hasten TFIID recognition of the promoter DNA, a typically rate-limiting step [106]. TFIIA has also been shown to increase TBP affinity for the TATA box [42,107], especially in conditions where TBP binding to promoter DNA is suboptimal [108]. Free subunits of TBP or TFIID can form homodimers in solution, likely as a way to regulate the rate of promoter recognition and PIC assembly; TFIIA can facilitate dissociation of TBP and/or TFIID homodimers, thereby accelerating promoter recognition [109]. TFIIA can be cleaved by Taspase1, and both cleaved and uncleaved forms can exist in cells. Cleavage is not necessary for activity, but it does impact turnover rate of TFIIA in cells, which may be a source of regulation [105]. The cleaved and uncleaved forms have different affinities for TBP and can form unique subcomplexes, which also have different affinities for promoter DNA and/or chromatin [110]. Therefore, the cellular concentration of TFIIA, the ratio of cleaved and uncleaved TFIIA, and its ability to bind to TBP/TFIID play critical roles in the regulation of PIC nucleation. Mutational and depletion studies targeting the TBP-TFIIA binding interface in yeast showed a decrease in transcriptional activity at various promoters, suggesting the impact of TFIIA on transcriptional output is promoter-specific [111,112]. Together, the literature surrounding TFIIA shows it supports transcription through diverse mechanisms that facilitate PIC assembly, recruitment of GTFs, and interactions with co-transcriptional activators and repressors.

TFIIB
TFIIB is a single-subunit (33 kDa) protein. In the ordered mechanism of PIC assembly, TFIIB is recruited to stabilize the DNA-TBP-TFIIA complex, then recruits Pol II/TFIIF to the PIC [39,40]. In addition, TFIIB helps specify the TSS and orient Pol II binding to ensure proper PIC directionality [113]. Within the core promoter, TFIIB recognizes the upstream (−38 to −32) and downstream (−23 to −17) TFIIB recognition elements (uBRE and dBRE, respectively) that surround the TATA box [114,115]. The BREs are often found in TATA-less promoters, allowing TFIIB to enhance TFIID binding in the absence of the TATA box [79]. While TFIIB can bind to the uBRE in the absence of TBP, recognition of the dBRE requires prior binding of TBP [114,115].
TFIIB has multiple important functional domains, including the B core domain, N-terminal B ribbon region, and B reader domain. The B core domain contains two cyclin repeats that recognize the BREs and properly position TFIIB on the promoter. The two cyclin repeats of TFIIB do not have the same affinity for the TBP-DNA complex, leading to a strong preference for a single orientation of TFIIB binding and thus ensuring that initiation cannot erroneously occur upstream of the TATA element [116,117]. A yeast Pol II-TFIIB co-crystal structure revealed that TFIIB positions the promoter DNA near the Pol II cleft to allow for active site access and stabilization by downstream GTFs [19]. The B ribbon functions as a molecular switch governing conformational changes taking place within the B core cyclin repeats upon DNA binding [118,119]. The B reader domain contains multiple motifs, including the B reader helix, loop, and strand regions that directly interact with Pol II. A yeast co-crystal structure showed the B reader loop extends into the mRNA exit channel of Pol II, possibly guiding the nascent RNA away from the template [120]. Obstruction of the exit channel suggests that TFIIB must either change conformation or release from the complex during early elongation. In vitro single-molecule studies demonstrated that synthesis of 7- and 9-nucleotide RNA transcripts triggers TFIIB release, as predicted by previous structural data [57]. Multiple structural changes throughout initiation and early transcription, combined with direct contacts with Pol II, TBP, and TFIIF, emphasize TFIIB as a critical participant in regulating the transcription reaction.

TFIIF
TFIIF was first identified as RNA Pol II-associated proteins, hence the RAP designation for its two RAP30 and RAP74 subunits [121]. TFIIF is thought to associate with Pol II away from the promoter DNA [122,123]. This interaction strongly inhibits non-specific DNA binding and initiation by Pol II, analogous to the bacterial σ factor [124]. Structural studies show a charged helix domain of RAP74 binding Rpb2 in the Pol II lobe, Rpb9 in the Pol II jaw, and Rpb1 [78,91,104]. Photo-crosslinking studies localized RAP30 just downstream of TBP (−19) and RAP74 just upstream of the TSS (−15 to −5) [125]. Data obtained using other structural methods agree with this positioning of RAP30 and show its winged-helix domain cooperating with TFIIB to stabilize the promoter DNA between −23 and −13 [78,91,126]. Both TFIIF subunits contain winged-helix domains with strong DNA-binding activity, which enhances Pol II/TFIIF affinity for the promoter and increases the overall stability of the PIC [127,128]. TFIIF also induces important structural changes in the PIC and topological changes in the DNA upon recruitment, including wrapping the DNA around Pol II [91,128].
Biochemical studies support a unique role for TFIIF as an early Pol II elongation factor by enhancing the transition to productive elongation and facilitating early RNA synthesis via suppressing abortive transcription [129,130]. Data suggest TFIIF increases the rate at which the first few phosphodiester bonds are formed, thus ensuring that nascent transcripts become long enough to resist abortive transcription [129]. TFIIF uses its position near the Pol II cleft to help maintain proper alignment of the 3′ end of the mRNA transcript in the Pol II active site, which helps prevent Pol II backtracking [131]. TFIIF is believed to remain associated with the polymerase through early transcription, as shown by results of biochemical studies [123,132,133], ChIP-qPCR [134], and ChIP-exo [132]. However, in cells this association is likely dynamic. Comparing structures of Pol II bound to various elongation factors reveals that many of these factors share overlapping binding interfaces on Pol II [135,136]. This mutual exclusivity due to shared binding sites allows for further temporal regulation of the early stages of transcript synthesis. Through its extensive network of protein-protein and protein-DNA contacts, TFIIF confers important stability to the PIC and regulates Pol II activity across early stages of transcription.

TFIIE and TFIIH
TFIIE consists of an αβ heterodimer with subunit masses of 56 and 34 kDa, respectively. TFIIE serves two main roles in PIC assembly: recruitment of TFIIH to the PIC and stimulation of multiple enzymatic activities of TFIIH [137–139]. TFIIEα binds strongly to TFIIH through its C-terminal domain, while the N-terminal domain is required for stimulation of TFIIH activities [137,140,141]. X-ray crystallography studies have shown that heterodimerization of TFIIE involves a winged-helix domain of TFIIEα directly contacting a winged-helix domain of TFIIEβ, and two coiled-coil helices that are intertwined with TFIIEα [142]. TFIIEα also contacts TBP and TFIIF, and it contains a short region homologous to the bacterial σ factor [137]. Using its zinc finger motif, TFIIEα enhances TBP binding at multiple promoter constructs in vitro and contacts the Rpb7 subunit of the Pol II stalk [104,126,143,144]. Biochemical studies have demonstrated TFIIEβ binding to Pol II, TBP, TFIIB, and TFIIF [139]. More specifically, some data show the winged-helix motifs of TFIIEβ contacting the winged-helix domain of RAP30 [104]. In addition to protein-protein contacts, TFIIEβ also makes extensive protein-DNA contacts; TFIIEβ contains three winged-helix motifs used to bind double-stranded promoter DNA just upstream of the TSS (−14 to −2) [91,125,145]. Along with a winged-helix domain of RAP30 and a winged-helix domain of TFIIEα, there are a multitude of stabilizing interactions upstream of the transcription start site that trap the DNA against Pol II [91,126]. TFIIEβ also has a basic helix-loop sequence that interacts with single-stranded DNA [141,145], which likely stabilizes the melted promoter DNA during open complex formation [146].
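The promoter coordinates quoted in this and the preceding sections can be collected into a simple lookup to show how densely the region upstream of the TSS is contacted. The Python sketch below uses only the positions stated in the text (relative to the TSS); the dictionary labels and the `factors_at` helper are illustrative names, and treating each footprint as a closed interval is an assumption.

```python
# Promoter DNA contacts relative to the TSS, as quoted in the text.
# Each entry is an inclusive (start, end) interval; labels are illustrative.
FOOTPRINTS = {
    "TFIIB (uBRE)": (-38, -32),
    "TFIIB (dBRE)": (-23, -17),
    "RAP30 (photo-crosslink)": (-19, -19),
    "RAP74": (-15, -5),
    "RAP30 winged helix + TFIIB": (-23, -13),
    "TFIIEbeta winged helices": (-14, -2),
}


def factors_at(position: int) -> list[str]:
    """Return the labels of all footprints covering the given position."""
    return sorted(
        name
        for name, (start, end) in FOOTPRINTS.items()
        if start <= position <= end
    )


print(factors_at(-35))  # ['TFIIB (uBRE)']
print(factors_at(-14))  # ['RAP30 winged helix + TFIIB', 'RAP74', 'TFIIEbeta winged helices']
```

Querying position −14 returns three overlapping contacts, consistent with the description of multiple winged-helix domains trapping the DNA against Pol II just upstream of the TSS.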
TFIIE exhibits more dynamic binding behavior than the other GTFs, with rather low stability within PICs [58,147]. TFIIE binding is also severely impaired at promoters without TFIIF and Pol II [58]. This lower stability is also reflected in structural data, where TFIIE cannot be resolved in the absence of TFIIH, or TFIIE must be added in excess to allow visualization [91,126]. This suggests cooperative and/or simultaneous binding of TFIIE and TFIIH, and that TFIIH increases stability of TFIIE in the assembling PIC. Multiple studies have also suggested that TFIIE plays a vital role in helping the polymerase clear the core promoter and transition from initiation to elongation [148][149][150].
TFIIH is a 10-subunit complex composed of two domains: the 7-subunit core domain and the 3-subunit CAK domain [151,152]. The core domain contains the XPB and XPD subunits with translocase and ATPase activities, while the CDK7 subunit of the CAK domain is responsible for TFIIH kinase activity. XPD forms a structural anchor between the core and CAK domains and is not enzymatically involved in Pol II transcription; however, XPD is involved in the DNA damage response via nucleotide excision repair [152][153][154][155]. Cryo-EM structures of free TFIIH suggest that the XPD subunit is at least partially inhibited when the CAK domain is present in the TFIIH complex, possibly serving as a source of regulation for the multi-purpose roles of TFIIH in DNA repair and transcription [151,156]. Upon binding to the PIC at the promoter, TFIIH undergoes conformational shifts and transitions to an active form, where XPB and CDK7 are then used during initiation [151]. The translocase activity of XPB is important for promoter melting, which allows Pol II to access the template strand DNA. Structural data have shown that the XPB subunit must displace a portion of the TFIID complex in order to contact the promoter DNA, thus allowing DNA opening to occur [78,89]. The kinase activity of CDK7 is critical for phosphorylation of the Pol II CTD at Ser5 residues, allowing for Pol II to transition to its hyper-phosphorylated form, which promotes initiation followed by promoter escape [157]. Both TFIIH enzymatic activities are points of regulation in the transcription reaction and are critical for productive transcription, as described in the following sections.

Promoter Melting and Initiation
Once the PIC has been properly assembled on the promoter DNA, the enzymatic activities of TFIIH facilitate promoter melting and initiation. For Pol II to access the template DNA strand, the double-stranded promoter DNA must first be melted around the TSS utilizing the ATP-dependent translocase (XPB subunit) activity of TFIIH. XPB contacts the DNA downstream of the TSS (+10 to +20) and uses its 5′ to 3′ translocase activity to twist the non-template DNA strand, while pushing it back toward the Pol II cleft [14,78,104,126,158]. Since the DNA is bound by TBP, TFIIA, and TFIIB upstream of the TSS, translocating the DNA toward the Pol II cleft causes mechanical and torsional strain. This facilitates DNA unwinding around the transcription start site, forming an open DNA conformation/transcription bubble. GTFs bound to the promoter stabilize the open transcription bubble to prevent reannealing of the separated DNA strands [159]. TFIIE and TFIIF together form a surface of four winged-helix domains that stabilize the single-stranded DNA and prevent it from leaving the Pol II cleft [104,126]. The template DNA strand then moves into the base of the Pol II cleft where the active site resides [126,152]. Pol II utilizes free NTPs and metal ion catalysis to initiate transcription and synthesize the first phosphodiester bond of the mRNA [160,161]. The Pol II active site then translocates to the next position on the DNA template for the next round of catalysis (i.e., NTP binding, phosphodiester bond formation, and translocation) [162].
Also important for initiating complexes is phosphorylation of the Pol II CTD on Ser5 residues. TFIIH utilizes the CDK7 kinase of its CAK domain to phosphorylate the heptapeptide repeats at Ser5 [152]. Within a PIC, the CTD is thought to make extensive contacts with Mediator to help position Pol II, and phosphorylation by CDK7 disrupts these contacts, facilitating the transition to initiation [163]. In addition, multiple studies have shown that Ser5 phosphorylation recruits mRNA 5′ capping machinery to the nascent RNA, consistent with capping occurring co-transcriptionally [164][165][166]. Phosphoproteomic studies have shown CDK7 to target many other transcription-associated substrates, and inhibition of CDK7 in cells led to defects in splicing, cell-cycle regulation, and RNA processing [43][165][166][167][168]. Experiments show that CDK7 inhibition alters Pol II occupancy across genes and transcribed RNA in patterns that suggest defects not only in initiation, but also in promoter proximal pausing and termination [169,170]. Taken together, these results showcase the critical role that the CDK7 kinase plays not only during initiation but also in multiple stages of transcription and co-transcriptional RNA processing.

Promoter Escape
During early transcription, transcribing complexes undergo numerous rearrangements as initiation complexes transform into elongation complexes and Pol II begins to move away from the start site of transcription. This stage of the transcription reaction, referred to as promoter escape, is less well defined than other stages. The complexity of promoter escape lies in the wide array of structural transformations that must occur as Pol II transitions from an initiation complex into an elongation complex. Protein-protein and protein-nucleic acid contacts formed in PICs must be broken, while other contacts are established, and several GTFs are thought to release from the transcribing complex [171,172]. In addition, the melted region of DNA expands then collapses to the size maintained during elongation [173]. Due to the transient nature of promoter escape, this stage of the transcription reaction is difficult to assay in cells. Therefore, current understanding relies largely on structural and biochemical approaches. In vitro kinetic studies show the rate-limiting step of promoter escape is complete when an 8-nucleotide RNA is made [168], and the polymerase clears the core promoter via synthesis of the initial 20-30 nucleotides of mRNA [171].
During promoter escape, early transcribing complexes have two possible fates: continued transcription toward productive elongation, or abortive transcription. Abortive transcription occurs when Pol II halts transcription after synthesizing a short mRNA product and releases from the template DNA. Research has shown that only a small fraction (~5-20%) of all PICs that form in vitro continue on to productive elongation [174][175][176]. Within cells, evidence suggests that less than 15% of Pol II molecules interacting with genes reach the initiation step, and an even smaller fraction proceed to elongation. This leads to a low percentage of Pol II/DNA interactions producing an mRNA transcript [60,63,177,178]. The biological significance of having such a small population of active complexes is a topic of great interest and requires distinguishing between a small population of productive complexes and the population of inactive complexes/interactions. While the cause of this heterogeneity in activity is unknown, it is likely that only the complexes that successfully complete all transformations that occur during promoter escape can proceed to elongation, which prevents improperly assembled complexes from continuing to elongate the transcript. These transformations include conformational changes among Pol II and the GTFs, correct positioning of the DNA in the Pol II cleft, and recruitment of elongation factors and other co-transcriptional machinery, among others.
During the transition to an elongation complex, Pol II needs to break contacts with promoter-bound GTFs as it transcribes away from the TSS. In vitro experiments show that TFIIB, TFIIE, and TFIIH release at points during early transcription [57,147]; TFIIF remains bound to Pol II during promoter escape but is not stably associated with Pol II later in elongation [123,132,133]. Studies suggest that TFIIF facilitates promoter escape by suppressing abortive transcription, restoring stalled Pol II complexes, and enhancing the effects of other positive elongation factors, such as TFIIS [179]. Biochemical studies suggest that TFIID remains bound at the core promoter, and TFIIB can re-associate with TFIID [133,180]. This may serve as a transcriptional "memory" to mark actively transcribed genes as cells progress through the cell cycle. For example, as cells pause transcription upon entry into mitosis, chromatin becomes compacted, and Pol II is removed from the genome; however, many sites still show high levels of TFIIB and TFIID occupancy through this period of gene silencing [181].

Promoter Proximal Pausing
The next major regulatory checkpoint is promoter proximal pausing (PPP), which occurs at nearly all Pol II-transcribed metazoan genes around 30-100 bases downstream of the TSS [182][183][184]. PPP is a highly regulated event in which Pol II pauses transcription and can either undergo premature termination or release into the gene body to proceed through transcript elongation. Multiple biological systems utilize PPP regulation to achieve temporal transcriptional control, including the immune response [185], hormone signaling [186,187], and early development [188,189], underlining the broad physiological relevance of this phenomenon.
From a mechanistic perspective, PPP is predominantly caused by binding of DRB Sensitivity Inducing Factor (DSIF) and Negative Elongation Factor (NELF) [190] (Figure 3). As Pol II escapes the promoter region, DSIF is recruited. DSIF is a heterodimer of Spt4 and Spt5 subunits, the latter of which contacts Pol II near the mRNA exit channel and facilitates 5′ end capping of the mRNA [191,192]. The Spt5 subunit of DSIF contacts the same Pol II interface as TFIIE; therefore, DSIF may be recruited as TFIIE is released [193,194]. DSIF binds to NELF, which stabilizes the paused complex and extends the lifetime of the paused state [190][195][196][197]. Pause release occurs with the recruitment of positive transcription elongation factor b (P-TEFb), which contains the CDK9 kinase. CDK9 phosphorylates both NELF and the Spt5 subunit of DSIF, triggering dissociation of NELF [196,198]. DSIF remains bound to the elongation complex after phosphorylation, stimulating Pol II elongation and recruiting other elongation factors [183]. In addition to phosphorylating Spt5 in DSIF, CDK9 also phosphorylates other elongation factors, chromatin modifiers, RNA processing factors, and Ser2 on the Pol II CTD, a known mark of active elongation [199,200]. Therefore, P-TEFb broadly encourages productive elongation and pause release through multiple phosphorylation targets in the elongation complex. Promoter-proximally paused complexes can undergo premature termination as opposed to release into productive elongation through mechanisms involving the multi-subunit Integrator complex [201]. The ratio of complexes released into elongation versus prematurely terminating and the rate of turnover of paused Pol II complexes are open areas of investigation and likely provide an additional layer of regulation for specific genes within different cell types.
Ongoing research studies have used inhibitors to probe the regulation and biological function of this important regulatory step in early transcription. Studies with inhibitors of CDK9 have shown that PPP is essential for productive transcription [202,203]. It is possible that PPP occurs to allow for conformational changes within the early elongation complex to take place, leading to higher stability of the Pol II transcription complex. Pausing may also allow for the recruitment of the full suite of elongation factors necessary for Pol II to transition to productive elongation. Additionally, PPP adds yet another layer of transcriptional regulation in response to changing cellular conditions: Pol II may undergo continuous cycles of PIC assembly, elongation to the PPP site, pausing, and termination until a signal allows for the transition to productive elongation [197].
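The fates of a paused complex described in this section can be summarized as a small state model. The Python sketch below is illustrative only: the names `Outcome`, `PausedComplex`, and `resolve_pause` are hypothetical, the yes/no inputs stand in for what are in reality kinetic and probabilistic events, and giving P-TEFb recruitment priority over Integrator activity is an assumption made purely to keep the example short.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    PAUSED = "remains paused"
    ELONGATING = "released into productive elongation"
    TERMINATED = "prematurely terminated"


@dataclass
class PausedComplex:
    """Toy model of a promoter-proximally paused Pol II complex."""
    bound_factors: set = field(default_factory=lambda: {"DSIF", "NELF"})
    phosphorylated: set = field(default_factory=set)


def resolve_pause(paused, ptefb_recruited, integrator_acts):
    """Return the fate of a paused complex under simplified yes/no inputs."""
    if ptefb_recruited:
        # CDK9 (in P-TEFb) phosphorylates NELF, Spt5 of DSIF, and CTD Ser2.
        paused.phosphorylated |= {"NELF", "DSIF/Spt5", "CTD-Ser2"}
        # NELF dissociates; DSIF stays bound and now stimulates elongation.
        paused.bound_factors.discard("NELF")
        return Outcome.ELONGATING
    if integrator_acts:
        # Integrator-dependent premature termination removes the complex.
        paused.bound_factors.clear()
        return Outcome.TERMINATED
    return Outcome.PAUSED


released = PausedComplex()
print(resolve_pause(released, ptefb_recruited=True, integrator_acts=False).value)
print(sorted(released.bound_factors))
```

In this simplified picture, the only factor left on a released complex is DSIF, which mirrors its switch from pausing factor to positive elongation factor after CDK9 phosphorylation.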

Elongation
After release from promoter proximal pausing, Pol II transitions to an elongation complex that can productively synthesize RNA. The elongation stage of Pol II transcription, and its regulation, is complex due to the involvement of a multitude of cofactors, regulators, chromatin interactions, and co-transcriptional processes. Elongation by Pol II is reviewed in detail elsewhere [204,205], with key points emphasized here. During elongation, mRNA is synthesized at speeds >2000 nucleotides per minute, as measured in cells after releasing Pol II from a drug-induced pause [202]. This rate drops to 300 nucleotides per minute using purified Pol II in an in vitro system [206], emphasizing the importance of elongation factors interacting with Pol II. Elongation factors include Poly (ADP-ribose) Polymerases (PARPs), elongation factor for RNA Pol II (ELL), TFIIS, elongin A, DSIF, and Spt6, among others [207]. Some of these elongation factors can also form subcomplexes of different compositions that work together to modulate Pol II elongation, such as the super elongation complex (SEC) consisting of P-TEFb, ELL proteins, and AFF family members [208,209].
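To give a sense of scale for the two elongation rates quoted above, the following Python sketch converts them into transcription times. It assumes a constant rate with no pausing, and the 100 kb gene length is a hypothetical value chosen only for illustration; the function name `transcription_minutes` is likewise invented for this example.

```python
def transcription_minutes(gene_length_nt, rate_nt_per_min):
    """Minutes needed to transcribe a gene at a constant elongation rate."""
    if rate_nt_per_min <= 0:
        raise ValueError("rate must be positive")
    return gene_length_nt / rate_nt_per_min


IN_CELL_RATE = 2000   # nt/min, lower bound quoted in the text for cells
IN_VITRO_RATE = 300   # nt/min, purified Pol II in an in vitro system

gene_length = 100_000  # hypothetical 100 kb gene (assumption)

in_cell = transcription_minutes(gene_length, IN_CELL_RATE)
in_vitro = transcription_minutes(gene_length, IN_VITRO_RATE)
print(f"in cells: at most {in_cell:.0f} min")
print(f"purified Pol II: about {in_vitro:.0f} min")
```

Because the in-cell figure is a lower bound on the rate, the corresponding time is an upper bound, so the difference between the two systems is at least roughly sevenfold under these assumptions.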
Another important complex that co-regulates Pol II elongation activity is the Polymerase-associated factor 1 Complex, or Paf1C [210,211]. This complex of six subunits in humans (five subunits in some organisms) is highly conserved across different levels of eukaryotes. Loss of Paf1C from mammalian cells causes accumulation of Pol II on gene bodies and slower elongation rates, while in vitro studies have shown a direct stimulatory role of Paf1C on elongation efficiency [211]. Paf1C interacts directly with the phosphorylated Pol II CTD tail and DSIF in the elongation complex [212][213][214][215]. Paf1C is recruited after the CDK9 kinase of P-TEFb phosphorylates NELF and DSIF, which leads to NELF dissociation, enabling Paf1C binding to DSIF. Some studies suggest that Paf1C and CDK9 share a mutual dependence for recruitment to active chromatin, thus implicating Paf1C in promoter proximal pausing. However, the exact role that Paf1C plays in pausing regulation remains unclear and is an ongoing point of investigation [216][217][218][219]. Paf1C has also been shown to associate with a myriad of other proteins, including gene-specific transcription factors and factors associated with developmental signaling pathways [220,221]. Due to its interactions with a vast number of factors, Paf1C is an important regulatory point for Pol II elongation.
During elongation, Pol II must transcribe through nucleosomal DNA, which is completed with the help of chromatin remodelers such as Chd1 and FACT [222,223]. As Pol II transcribes through the genome at high speed, these factors rearrange histone complexes ahead of the polymerase and replace histones behind the elongation complex [4,224]. Recent evidence also suggests that FACT recycles nucleosomes to maintain epigenetic modifications on the histone tails, thus maintaining the chromatin state through rounds of transcription [225]. Studies have implicated Paf1C in chromatin modifications and epigenetic control through direct contacts with related factors, including FACT [226,227] and Chd1 [228,229]. Additionally, Paf1C has been associated with the maintenance of several histone marks for active chromatin, further emphasizing its role as an important regulator of Pol II productive elongation [229][230][231][232].
Throughout the elongation and ultimately termination phases of transcription, the phosphorylation state of the Pol II CTD changes, which is thought to help recruit and regulate elongation factors and co-transcriptional machinery. For example, Ser2 phosphorylation increases as Pol II moves throughout the gene body, which signals the entry of Pol II into the elongation phase and facilitates association of elongation, termination, splicing, and nuclear export factors with the transcribing complex [36][233][234][235][236][237]. Tyr1 phosphorylation, which peaks at the promoter proximal pause, is present within the gene body before falling near the 3′ end of the gene. In yeast, it has been shown that Tyr1 phosphorylation impairs recruitment of termination factors to prevent premature termination, thereby acting to promote elongation [238]. Accordingly, dephosphorylation of Tyr1 is also important for proper termination in yeast [239]. Consistent with this, in human cells, mutating the majority of Tyr1 residues to phenylalanines in the Pol II CTD caused a genome-wide termination defect [240]. In both yeast and humans, Thr4 follows a similar phosphorylation profile to Ser2, becoming phosphorylated throughout the gene body and peaking at the 3′ end of genes [23]. ChIP-seq studies in human cells revealed that Thr4 is essential for productive elongation and that mutation of Thr4 leads to a genome-wide elongation defect [241]. Moreover, in human cells Thr4 phosphorylation is thought to be important for 3′ end processing of nascent transcripts [38].
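The residue names used above (Tyr1, Ser2, Thr4, Ser5) refer to positions within the consensus CTD heptapeptide repeat, Tyr-Ser-Pro-Thr-Ser-Pro-Ser. As a reading aid, the Python sketch below maps each mark to its residue and restates the qualitative trends described in this review; the dictionary is a summary of the prose, not quantitative data, and the names `MARK_TRENDS` and `residue_at` are hypothetical.

```python
HEPTAD = "YSPTSPS"  # consensus CTD repeat: Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7

# Qualitative trends restated from the text; not quantitative data.
MARK_TRENDS = {
    "Ser5": "phosphorylated by CDK7 (TFIIH) at initiation",
    "Tyr1": "peaks at the promoter proximal pause, falls near the 3' end",
    "Ser2": "increases as Pol II moves through the gene body",
    "Thr4": "increases through the gene body, peaks at the 3' end",
}


def residue_at(mark):
    """Return the one-letter residue for a mark name such as 'Ser5'."""
    position = int(mark[3:])      # 1-based position within the heptad
    return HEPTAD[position - 1]


for mark, trend in MARK_TRENDS.items():
    print(f"{mark} ({residue_at(mark)}): {trend}")
```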

Termination
The final stage in a single round of transcription is termination. This step is described briefly below (Figure 4) and is reviewed more thoroughly elsewhere in the literature [242][243][244][245]. Elongation proceeds until Pol II transcribes through the polyadenylation site (PAS; AAUAAA in human mRNA transcripts), which signals for transcription termination. The polymerase slows down during this phase of the reaction, which is facilitated by dephosphorylation of the Spt5 subunit of DSIF by the PP1-PNUTS phosphatase complex [246]. The cleavage and polyadenylation complex (CPA) associates with Pol II, likely due to CTD interactions. The CPA, constituting the core of the termination machinery, contains an array of multi-subunit factors that recognize the PAS and other sequences in the mRNA. This complex includes an endonuclease (CPSF73) that cleaves the nascent mRNA to generate the 3′ end of the mRNA, and a polyA polymerase that attaches up to several hundred adenosine residues to the 3′ end [242][243][244][245]. Polyadenylation of the 3′ end protects the mRNA from exonuclease degradation and aids in nuclear export and translation [247]. Some genes contain multiple PAS sequences, allowing for alternative termination and polyadenylation of the mRNA product akin to alternative splicing variants [248]. Paf1C is also involved in termination by regulating co-transcriptional processes, including the cleavage and polyadenylation machinery [249][250][251], as well as post-transcriptional processes, including mRNA export machinery [250][251][252]. Cleavage of the nascent mRNA by CPSF73 occurs just downstream of the polyA site, which facilitates termination through two mechanisms that are not mutually exclusive: the "allosteric" model and the "torpedo" model [244]. The former model suggests that allosteric changes occur within the elongation complex and/or the nascent mRNA, which prompt transcribing Pol II to release the DNA template and its mRNA product [253,254]. In the torpedo model, after cleavage of the nascent mRNA, the exonuclease Xrn2 accesses the 5′ end of the nascent RNA strand behind the still transcribing Pol II and races to catch and displace Pol II, resulting in termination [246]. It is likely both of these models contribute to regulation of termination [255]. After dissociation, Pol II is then free to assemble into other transcription complexes, and the new mRNA product is exported to the cytosol for further processing and translation.
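Because some genes carry more than one PAS, a simple first step when inspecting a transcript is to list every occurrence of the canonical hexamer. The Python sketch below does only that: the function name `find_pas_sites` and the example sequence are made up for illustration, and matching the AAUAAA string alone is a simplification, since the CPA also recognizes other sequences in the mRNA.

```python
PAS = "AAUAAA"  # canonical human polyadenylation signal quoted in the text


def find_pas_sites(rna):
    """Return 0-based start indices of every PAS hexamer in an RNA string."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    start = rna.find(PAS)
    while start != -1:
        sites.append(start)
        start = rna.find(PAS, start + 1)
    return sites


transcript = "GGCAAUAAAGCUUCGAUCCAAUAAAGG"  # made-up sequence (assumption)
print(find_pas_sites(transcript))  # [3, 19]
```

A transcript returning more than one index, as here, is a candidate for alternative polyadenylation, although which site is actually used cannot be inferred from the sequence match alone.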

Summary and Future Perspectives
The complexity of the Pol II transcription system has challenged researchers for decades as they work to understand the many layers of regulation, protein factors involved, and co-transcriptional processes. The field has elucidated general factors and fundamental mechanisms at each of the major stages of transcription (PIC assembly, initiation, promoter escape, PPP, elongation, and termination), but there are still many unanswered questions left to investigate. Structural biologists are achieving high-resolution structures of Pol II complexes and their interacting partners, which will continue to provide mechanistic insight into how these large and flexible complexes function. This will be complemented by studies using imaging technologies (e.g., biochemical single-molecule and live-cell single-particle microscopy) to reveal new modes of regulation by dynamically tracking the GTFs, ultimately through all stages of transcription in real time. Research dedicated to understanding intergenic regions of the genome will uncover more regulatory mechanisms for gene expression, including enhancers and long non-coding RNAs. Advancements in sequencing methods, bioinformatics, proteomics, and computational biology will illustrate how Pol II works with various partners to regulate gene expression, and how the genome itself is structured to globally modulate the Pol II system. The immense complexity of transcription and the human genome provides not only a great challenge for researchers but also incredible reward in understanding how the two are intricately connected and balanced.

Figure 1. Phosphorylation state of the Pol II CTD is regulated during transcription. As Pol II transcribes through a gene and progresses through the stages of transcription (shown from left to right), different phosphorylation marks are added or removed to promote unique functions. The phosphorylation patterns shown here pertain to human Pol II; other organisms may exhibit slight differences in these patterns. TSS, transcription start site; PAS, polyadenylation site.

Figure 2. Two mechanisms for PIC formation, which are not mutually exclusive. (a) In the stepwise model, Pol II and the GTFs assemble in a particular order facilitated by one factor recruiting the subsequent factor via protein-protein interactions. (b) In the holoenzyme model, minimally TFIID and TFIIA assemble at the promoter while Pol II and remaining GTFs form a subcomplex that binds to the promoter, completing PIC assembly.

Figure 4. Pol II requires a complex network of factors to facilitate termination. Pol II transcribes through the gene body at over 2 kilobases per minute (1). Subunits in the CPA complex recognize the PAS in the transcript RNA, along with other regulatory sequences. The CPSF73 endonuclease subunit cleaves the nascent RNA to generate the 3′ end of the mRNA, which undergoes further processing such as addition of up to several hundred adenosine residues to the 3′ end of the cleaved mRNA by polyA polymerase (2). PP1/PNUTS dephosphorylates the Spt5 subunit of DSIF (gray arrow), which causes Pol II to decelerate (3). The XRN2 exonuclease binds to the 5′ end of the nascent transcript, digesting it until Pol II is dislodged from the genome (4).

Table 1. Summary of the general transcription factors and RNA polymerase II.