Regulation of Transcription Elongation and Termination

This article will review our current understanding of transcription elongation and termination in E. coli. We discuss why transcription elongation complexes pause at certain template sites and how auxiliary host and phage transcription factors affect elongation and termination. The connection between translation and transcription elongation is described. Finally we present an overview indicating where progress has been made and where it has not.


Introduction
This article will review our current understanding of transcription elongation and termination in E. coli. Because of the large scope of the article, we have largely confined ourselves to recent manuscripts. Older references have, in general, been omitted except when necessary or when they consist of our own work.
RNA synthesis by bacterial RNA polymerases (RNAP), although processive, does not proceed at a uniform rate. Template sequences can induce pausing or backtracking, i.e., movement of the transcription elongation complex (TEC) towards the promoter. Forward or backward movement of TEC does not entail loss of energy. The RNA:DNA hybrid, 9 nt in the post-translocated state and 10 nt in the pre-translocated state, is maintained independently of the direction of RNAP movement.
Translocation speed and direction are also influenced by accessory transcription factors. NusG and NusA suppress or enhance pausing, respectively. UvrD can push TEC backwards to reveal damaged DNA, and Mfd can push it forward as part of the transcription-coupled repair process. The GreA/B proteins can rescue backtracked TEC by removing extruded RNA and restoring the 3'-end of the nascent transcript to the TEC active center. Backtracking is also suppressed when TEC is coupled to translating ribosomes, which present a barrier to retrograde translocation. Phage functions also affect transcription elongation. λ N protein accelerates transcription by inhibiting pausing, whereas HK022 Nun arrests transcription by preventing TEC translocation.
Transcription terminates at the ends of operons or, under certain conditions, within genes. Termination can be template-encoded and factor-independent (intrinsic termination), or can require accessory factors, such as Rho, Mfd and DksA.
There has been considerable progress in understanding these aspects of transcription elongation, although areas of controversy remain. We will summarize the new findings and try to smooth out the contradictions in the following pages.

Pausing
The rate of transcription elongation by E. coli RNAP is not uniform. RNA synthesis is characterized by pauses, some of which may be brief and resolved spontaneously, whereas others may lead to TEC backtracking. Backtracked TEC can restart when acted upon by GreA/B factors, which restore the 3'-end of the nascent transcript to the active center. Pausing may regulate gene expression, as in the case of attenuation or phage λ gene Q antitermination. In this section we will discuss recent advances in understanding the mechanism and specificity of pausing.
Elongation rate and pausing are determined by template sequence and RNA structure (e.g., stem-loops) and involve at least two components of the RNAP catalytic center, the bridge helix (BH) and trigger loop (TL). Elongation is proposed to occur in two steps. First, the TL folds in response to NTP binding. Mutational analyses indicate that this conformational change in the TL can be rate-limiting, and reflects the ability of the incoming NTP to bind to TEC. The second step is the incorporation of the NTP and the release of pyrophosphate. Kinetic conformational changes in the TL, however, do not account for pause recovery, since the TL remains unfolded during a pause [1].
Pausing not associated with backtracking is frequent, occurring on average every 100 bases of DNA [2]. The paused intermediates are distinct from the intermediates of the main reaction pathway, and they are not associated with translocation delay. The paused complex contains the 3'-end of the transcript in the active center and is capable of binding the next cognate NTP. Pausing is highly dependent on the NMP at the 3'OH end of the nascent transcript. For example, pausing at C37 on a T7A1 template is significantly reduced when the template substitutes a G37 for C37. Substitutions at position 38 also affect pause times, which might indicate the ability of the incoming XTP to bind to TEC. These considerations have led to the idea that the 3'OH base may not be fully base-paired with the template, even though it lies in the post-translocated state.
Zenkin and his coworkers [3] analyzed pauses that result from failure of TEC to translocate from the elongation-inactive pre-translocated state to the active post-translocated state. These pauses reflect the ability of RNAP core to sense the identity of base pairs at most of the positions of the RNA-DNA hybrid. It is not clear if the sequence or the structure of the hybrid induces pausing. Some of these pauses are associated with "backstepping", i.e., movement of TEC towards the promoter by one bp, with associated displacement of the 3' ribonucleotide from the active center.
A genome-wide in vivo analysis of TEC occupancy on the E. coli chromosome defined almost 20,000 pause sites [4]. Analysis of these sites revealed a consensus sequence that consists of G-10 Y-1 G+1 (where -1 corresponds to the position of the RNA 3' end). A similar result was obtained by Larson et al. [5]. This sequence is proposed to induce pausing through an interaction between RNAP core enzyme and a core recognition element (CRE) located at the 3'-end of the RNA:DNA duplex. The interaction stabilizes TEC in the pre-translocated state, thus inhibiting addition of the next nucleotide to the nascent transcript. Each position of the consensus pause sequence is predicted to favor the pre-translocated state over the post-translocated state (the G-10 through effects on duplex stability, the Y-1 through effects on active-center binding, and the G+1 through both). Mutational probing of RNAP supports this model. Thus, RNAP βD446 hydrogen bonds with Watson-Crick atoms of G complexed with CRE, suggesting that D446 recognizes this nucleotide. As predicted, RNAP βD446A cannot distinguish G, A, T, or an abasic site at position G+1 in vitro and pauses with equal efficiency on the various templates. However, the mutant RNAP is more likely than wild-type enzyme to be in the pre-translocated register on the G+1 template, i.e., more likely to pause. This observation is curious, and does not fit readily with the above model.
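The position numbering of the G-10 Y-1 G+1 consensus can be made concrete with a short scan. The following Python sketch is only an illustration: it assumes the input is the non-template (RNA-like) strand written 5' to 3', treats Y as C or T, and demands exact matches at all three positions. The function name find_consensus_pauses is hypothetical, and the genome-wide analyses cited above derive a statistical consensus rather than an exact-match rule.

```python
def find_consensus_pauses(seq):
    """Return 0-based indices of candidate -1 positions (the RNA 3' end).

    Assumption: seq is the non-template strand, 5' to 3'.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    # Index i is position -1, so -10 is i - 9 and +1 is i + 1
    # (the numbering has no position 0).
    for i in range(9, len(seq) - 1):
        if seq[i - 9] == "G" and seq[i] in "CT" and seq[i + 1] == "G":
            hits.append(i)
    return hits


# Hypothetical test sequence: G at -10, C at -1, G at +1.
example = "G" + "A" * 8 + "C" + "G"
print(find_consensus_pauses(example))
```

Relaxing the exact-match rule (for example, scoring each position separately) would be closer in spirit to a consensus, but the weights would have to come from the original data.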
Interestingly, the pause-inducing consensus sequence is enriched at translation start sites in both E. coli and Bacillus subtilis [5]. It is conceivable that these pause sites play a regulatory role coupling transcription and translation by allowing linkage of the lead ribosome to RNAP.
Pauses also occur at sites resembling the promoter sequence to which the RNAP σ70 subunit binds [6,7]. σ70 engages the promoter-like sequence and TEC, which briefly continues RNA synthesis. Sigma-dependent pausing generates stressed elongation complexes that are resistant to GreA and GreB cleavage, suggesting that the 3'-end of the RNA is in the active site, as would be expected in a paused, scrunched complex. The scrunched complexes are resolved either by breakage of the TEC-σ70-promoter-like sequence interaction, or by isomerization to a backtracked conformation. TEC involved in this type of pause may consist of persistent holoenzymes, in which σ70 regions 1.2 and 2 remain in contact with the RNAP core. The pausing frequency of elongating holoenzyme is not clear, and may depend on the growth rate of the bacteria [8].

Intrinsic Termination
Intrinsic termination occurs at specific template sequences: an inverted repeat followed by a run of A residues. Termination is driven by formation of a short stem-loop structure in the nascent RNA chain. RNA synthesis arrests and TEC dissociates at the 7th and 8th U of the run. Formation of the stem-loop dissociates the weak rU:dA hybrid. Stem-loop formation is hindered by upstream complementary RNA sequences that compete with the downstream portion of the stem, as well as by RNA:protein interactions in the RNA exit channel. Intrinsic termination depends critically upon timing. Hairpin folding and transcription of the termination point must be coordinated, so that the complete hairpin is formed by the time RNAP transcribes the termination point. The size of the stem, the sequence of the stem and the length of the loop all affect termination efficiency (see Figure 1).
Nedialkov et al. [9] have studied the role of RNAP domains in intrinsic termination. The bridge α-helix in the β' subunit borders the active site and may have roles in catalysis and translocation. Mutations in the YFI motif (β' 772-YFI-774) affect intrinsic termination as well as pausing, fidelity and translocation of RNAP. One mutation, F773V, abolishes the activity of the λ tR2 intrinsic terminator, although neighboring mutations have little effect on termination. Modeling suggests that this unique phenotype reflects the ability of F773 to interact with the fork domain in the β subunit.
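The sequence signature described above (an inverted repeat whose transcript can fold into a stem-loop, followed directly by a run of U residues) can be sketched as a naive search over an RNA sequence. This is a toy illustration under stated assumptions, not a terminator predictor: it requires a perfect Watson-Crick stem of fixed length, a loop of 3 to 8 nt, and an uninterrupted U run immediately after the stem. The function names and the default values (stem=5, u_run=7) are hypothetical choices made for the example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string containing only A, C, G, U."""
    return "".join(COMPLEMENT[b] for b in reversed(rna))


def find_intrinsic_terminators(rna, stem=5, loops=range(3, 9), u_run=7):
    """Return (start, loop_length) for each perfect hairpin followed by a U run.

    Assumptions: perfect stem pairing, no bulges, U run adjacent to the stem.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for start in range(len(rna) - stem + 1):
        left = rna[start:start + stem]
        if any(b not in COMPLEMENT for b in left):
            continue  # skip windows containing ambiguous bases
        partner = reverse_complement(left)
        for loop in loops:
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + u_run]
            if right == partner and tail == "U" * u_run:
                hits.append((start, loop))
    return hits


# Hypothetical transcript: 5-bp stem, 4-nt loop, eight U residues.
example = "GCGCC" + "UUAA" + "GGCGC" + "U" * 8
print(find_intrinsic_terminators(example))
```

Because stem size, stem sequence and loop length all affect termination efficiency, a realistic search would score these features rather than apply fixed cutoffs.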

Rho-Dependent Termination
Transcription termination factor Rho is an essential protein in E. coli, first identified for its role in transcription termination at Rho-dependent terminators [10], and is estimated to terminate ~20% of E. coli transcripts [11]. rho is highly conserved and nearly ubiquitous in bacteria [12]. Rho is an RNA-dependent ATPase [13] with RNA:DNA helicase activity [14], and is a hexamer of identical monomers arranged in an open circle [15]. Transcription of the single copy of rho is regulated by Rho-dependent transcription termination at a Rho termination site located upstream of the structural gene [16].
Rho-dependent terminators consist of rut sites (rho utilization) and release sites [17]. Rho binds with high affinity to the rut site RNA, is stimulated to hydrolyze ATP, and then translocates along the RNA in a 5' to 3' direction while maintaining an interaction with rut [18,19], until it encounters TEC at a release site. Termination requires untranslated RNA of at least 85-97 nt [20].
RNA binds to two distinct sites in Rho, termed primary and secondary [21]. The primary site stably binds RNA in the absence of ATP. The secondary sites are stimulated to bind RNA transiently after the primary site is occupied. Secondary site binding stimulates ATP hydrolysis. Crystallography has identified the location of the primary site on the outer edge of the hexamer and the secondary site around the center hole [15,22]. The Rho hexamer initially binds RNA in an open "lockwasher" conformation. After RNA is bound to the primary site, the transcript is threaded through the central hole, contacting the secondary binding site, and the hexamer closes (Figure 2, [23]). Single molecule experiments determined that Rho binds 57 ± 2 nucleotides of RNA in the absence of ATP and 80 ± 2 nucleotides upon ATP hydrolysis, consistent with 60 nt binding at the primary and 20 nt at the secondary site [19]. The transient interactions of RNA with the secondary site drive Rho translocation along the RNA until it encounters TEC at a release site. At this point, Rho releases RNAP from the template, presumably by unwinding the RNA-DNA hybrid. The precise mechanism of arrest and removal is unclear. Epshtein et al. [24] propose that Rho causes a conformational change in RNAP leading to arrest of TEC and exposure of the transcription hybrid to Rho. In this model, Rho then unwinds the RNA-DNA hybrid, removing RNAP. The authors argue that since crosslinking data indicate that the active site does not move in relation to the template during transcription termination, forward translocation does not contribute to termination. Park and Roberts [25], however, found that Rho induces forward movement of TEC, and that mispairings in the DNA template immediately 5' to the arrested TEC decrease the efficiency of Rho termination. Park and Roberts propose that Rho induces termination by pushing RNAP ahead of the transcription hybrid.
The precise mechanism of removal is thus still unclear, but the above studies suggest that forward pressure on RNAP from Rho causes a conformational change leading to arrest. Continued pressure then exposes the transcription hybrid to Rho either by a conformational change in RNAP or by removal without forward translocation. Termination may not entail specific interactions with RNAP, since E. coli Rho factor will efficiently terminate transcription of Saccharomyces cerevisiae RNA pol II [26].
ChIP-chip analysis reveals that Rho associates with the TEC throughout transcription, rather than only after synthesis of an untranslated rut binding site [36]. Kalyani et al. [37] instead maintain that a transcribed rut element is required for Rho association with RNAP, and suggest that the ChIP-chip data do not reflect a relevant interaction between Rho and TEC. Single molecule studies show no evidence that Rho directly binds to RNAP [19]; however, Epshtein et al. [24] did detect direct binding to RNAP in vitro. The reason for this discrepancy remains unclear. Rho-dependent termination in E. coli occurs predominantly within the "foreign" DNA (e.g., cryptic prophages and transposons) that makes up ~14% of the genome of E. coli MG1655 [38]. Efficient transcription termination in E. coli is essential to suppress expression of toxic genes in this horizontally-acquired DNA [38]. Although Rho activity can be significantly reduced by nusA or nusG mutations in a strain deleted of all horizontally transferred DNA, rho cannot be deleted in this strain. Washburn and Gottesman proposed that Rho is essential to prevent collisions between TEC and the replisome [39]. Rho termination within coding sequences may depend upon ribosome release by tmRNA, which can uncouple the lead ribosome from RNAP [40].
The RNA-DNA helicase activity of Rho can unwind extensive RNA-DNA hybrids. Harinarayanan and Gowrishankar [41] suggest that Rho prevents RNA-DNA hybrids ("R-loops") from forming between untranslated mRNA and the chromosome. In support of this hypothesis, Leela et al. [42] found that rho could be deleted in a rac− strain expressing the RNA-DNA helicase uvsW. This model assumes that both Rho and UvsW prevent or remove potentially lethal R-loops. Nevertheless, rnhA deletion mutants are healthy despite extensive accumulation of R-loops. Nor are rnhA mutants more dependent on rho for survival than wild-type [43,44]. Perhaps Rho and UvsW share another essential activity, such as resolving conflicts between transcription and replication, or removing certain toxic R-loops sequestered from RNase HI. Note that Dutta et al. [45] demonstrated that suppression of R-loop formation reduces transcription-replication clashes.
Peters et al. [46] found that inhibiting Rho or deleting nusG resulted in the accumulation of antisense transcripts. The antisense transcripts are untranslated, thus allowing Rho access to termination sites. Surprisingly, nusA does not contribute to termination of the antisense transcripts, despite the similarities in transcription patterns in nusG mutants, nusA mutants and cells treated with the Rho inhibitor bicyclomycin [38]. Whether or not this activity of Rho is important in vivo is unclear.
The efficiency of Rho termination is dependent on the rate of transcription elongation [47]. Slow RNAP mutants have a decreased affinity for NTPs (e.g., rpoB8, 5-fold higher Km for ATP) and are more efficient at Rho-dependent termination. Conversely, fast mutants have an increased affinity for NTPs (e.g., rpoB3595) and are less efficiently terminated. Enhanced termination was observed in vitro when the transcription rate was slowed by limiting NTP concentrations. This "kinetic coupling" model might explain the correlation between TEC pausing and Rho termination sites.
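The effect of a 5-fold higher Km on elongation rate can be illustrated with a one-line rate law. The sketch below assumes simple Michaelis-Menten dependence of elongation rate on NTP concentration and an unchanged maximal rate in the mutant; both are simplifying assumptions made here for illustration and are not taken from the studies cited. The function name and the arbitrary concentration units are likewise hypothetical.

```python
def relative_elongation_rate(ntp, km):
    """Fraction of the maximal rate, [NTP] / (Km + [NTP]) (assumed rate law)."""
    if ntp < 0 or km <= 0:
        raise ValueError("ntp must be >= 0 and km must be > 0")
    return ntp / (km + ntp)


km_wt = 1.0            # arbitrary units (assumption)
km_slow = 5.0 * km_wt  # 5-fold higher Km, as reported for rpoB8

for ntp in (0.5, 1.0, 10.0):
    wt = relative_elongation_rate(ntp, km_wt)
    slow = relative_elongation_rate(ntp, km_slow)
    print(f"[NTP]={ntp}: wild-type {wt:.2f}, slow mutant {slow:.2f}")
```

Under these assumptions the difference between the two enzymes is largest when NTP is limiting and shrinks as NTP approaches saturation, which is in line with the observation that lowering NTP concentrations enhances termination in vitro.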
The histone-like nucleoid-structuring protein H-NS contributes to Rho-dependent termination. hns deletion mutations increase Rho dependency [43,44], and H-NS is concentrated at the antisense transcription terminators [46]. This is distinct from the role of H-NS in silencing foreign genes by repressing promoters [48]. Horizontally-transferred DNA is AT rich compared to E. coli sequences, which favors H-NS binding.

NusG, NusA and DksA
TEC are accompanied by transcriptional cofactors that affect the rate of elongation and specify loci of transcription termination. Among these are NusA, NusG, and the ppGpp cofactor, DksA.

NusG
The 21 kDa E. coli NusG is composed of two domains connected by a flexible linker. It affects transcription elongation through a variety of mechanisms. The NusG-NTD directly suppresses pausing and thus enhances the overall rate of transcription elongation [49]. Structural studies with the archaeal NusG homologue, Spt5, suggest that the NusG-NTD enhances TEC processivity by completely encircling the DNA binding channel of RNAP, thus stabilizing the closed conformation of the RNAP clamp domain [50]. Single molecule analyses indicate that NusG decreases the rate of entry into both short-lifetime and, more significantly, long-lifetime pauses. Suppression of long-lifetime pauses is proposed to account for NusG-NTD enhancement of transcription elongation. According to this model, NusG-NTD increases movement of TEC along the DNA template by promoting transition from the pre-translocated towards the post-translocated register [51].
The NusG-CTD KOW domain interacts with NusE/S10, thus linking TEC to the lead ribosome. Coupling of transcription to translation suppresses backtracking and possible clashes with the replisome [45]. The NusG-CTD also binds to, and activates, termination factor Rho with the same interface with which it binds NusE/S10. Thus ribosome-associated NusG-CTD is not available to enhance Rho-dependent termination [52]. Linkage between the lead ribosome and TEC also suppresses formation of untranslated RNA, which is required for Rho to access TEC. Sequestering of the NusG-CTD and the absence of RNA ligand together account for the absence of Rho-dependent termination in well-translated genes.
In vivo, it is not known whether NusG associates first with RNAP, first with ribosomes, or with both simultaneously. Genome-wide surveys suggest that NusG associates with TEC only after significant elongation has occurred [36]. This is difficult to reconcile with the coupling hypothesis, since free TEC could be targeted by Rho (see below). On the other hand, the E. coli NusG paralogue, RfaH, appears to link ribosomes to TEC early after transcription initiation [53].
The reported in vivo copy number of NusG [54] is one sixth the number of ribosomes (≈55,000 copies/cell, [55]). This is consistent with the idea that only the first ribosome in translating polysomes associates with NusG and TEC.
However, ribosomal stalling at rare codons or induced by amino acid analogues can uncouple transcription from translation and induce intragenic Rho-dependent termination. The stalled ribosome is attacked by tmRNA, which competes for binding with NusG-CTD to S10 and releases the impacted ribosome [40].
Activation of Rho explains why NusG is essential in wild-type E. coli. The cryptic rac prophage carries a constitutive promoter and a downstream kil gene whose expression is lethal to the bacterial host. Rho-dependent termination prevents transcription extension from the promoter to kil. Deletion of the rac prophage allows E. coli to support a nusG deletion, although the mutant strain grows poorly and dies in stationary phase [38].
Oddly, B. subtilis NusG stimulates pausing at two regulatory sites in the untranslated leader of the B. subtilis trp operon that participate in transcription attenuation and translational control mechanisms, respectively. To induce pausing, B. subtilis NusG makes sequence-specific contacts with a T-rich sequence in the nontemplate DNA strand within the paused transcription bubble [56]. The E. coli NusG homologue, RfaH, makes similar contacts with the template ops element. Pausing at ops may allow RfaH to link to the lead ribosome and couple transcription to translation in ops-bearing operons [57].

NusA
E. coli NusA protein was originally identified genetically as a required component of the phage λ N antitermination complex, and biochemically as a factor that stimulated lacZ gene expression in vitro [58]. RNAP is modulated by NusA protein and vice versa. NusA enhances pausing as well as termination at intrinsic termination sites. Paradoxically, it also suppresses transcription termination as part of the λ N or rrn antitermination complexes. NusA is thought to provoke termination when present in 1:1 stoichiometry with RNAP, and antitermination (as part of the NusBEG/λ N complex) when the stoichiometry is 2:1. Numerous studies place NusA near the RNA exit channel [59]. Gusarov and Nudler [60] proposed that NusA weakens RNA binding to the upstream binding site (UBS) in the channel, allowing formation of the RNA stem-loop that induces termination at intrinsic terminators. However, direct binding of NusA-NTD to RNA:RNA duplexes in the exit channel has recently been demonstrated [61]. Thus, direct stabilization of RNA:RNA hybrids in the exit channel might instead be responsible for NusA stimulation of pausing and intrinsic termination.
The NusABEG/λ N antitermination complex forms at the NUT sequences of the λ nascent transcript. These sequences lie in the phage chromosome between the λ pL and λ pR promoters and the first termination signals in their operons. NusA binds to the NUT SPACER sequence within NUT. However, NusA binding to RNA is dependent on its association with TEC. NusA binds to TEC via two distinct domains, the NTD and an acidic domain in the CTD (AR2). The C-terminal domain (CTD) of the RNAP α-subunit (αCTD) interacts with the acidic CTD 2 (AR2) of NusA, releasing the autoinhibitory blockade of the NusA S1-KH1-KH2 motif and allowing NusA to bind RNA. The solution conformation of the AR2:αCTD complex shows that the αCTD residues that interface with AR2 are identical to those that recognize UP promoter elements. This is consistent with a role for NusA in transcription initiation of operons carrying UP elements, although evidence for such an activity has yet to be unearthed [62].
Earlier studies suggested that the binding of NusA-NTD and σ70 to RNAP were mutually exclusive. However, recent structural data show association of NusA-NTD with the β-flap tip helix, a site distinct from the major σ70 region 2 binding site, the CH region of the β' subunit. Nevertheless, NusA might compete with the weak binding of σ70 region 4 to the β-flap tip [63].
Unlike nusG, nusA cannot be completely deleted. A NusA truncation that retains the NusA-NTD can be introduced into strain MDS42, which lacks all horizontally transmitted elements, including the cryptic rac prophage [64]. This is explained by the finding that the NusA-NTD has biochemical activity; NusA-NTD by itself is necessary and sufficient for enhancement of transcriptional pausing. The other, dispensable, NusA domains provide additional interactions with TEC and are required to stimulate intrinsic termination [61].

DksA
DksA was originally isolated as a suppressor of a chaperone mutation, dnaK. It was then shown to disrupt RNAP open complexes at rrn promoters under the influence of ppGpp and NTP [65]. DksA and GreA/B belong to a family of coiled-coil proteins that bind within the secondary channel of RNAP. Despite structural similarities to GreA/B, DksA cannot induce RNAP to cleave RNA in backtracked RNAP.
The critical difference between DksA and GreA/B lies in a few residues at the tip of the coiled coil. These residues contact the RNAP active center [66]. There is physiological interplay between the two functions. Thus, microarray analysis indicates that many genes are similarly regulated by DksA and GreA. GreA overproduction can suppress a dksA growth defect. At other genetic loci, however, DksA and GreA act oppositely. The biochemical basis of these interactions remains to be elucidated [67]. Like DksA, GreA can act at promoters, where it facilitates promoter escape. In particular, this activity of GreA strongly stimulates expression of ribosomal protein operons and the tna operon [68].
The fact that ppGpp inhibits RNA chain extension prompted experiments to test if DksA also acted on TEC in vitro. Although wild-type DksA has little or no effect on the rate of RNA synthesis with wild-type RNAP, a DksA mutant with enhanced affinity for RNAP slows elongation in a ppGpp-independent fashion, although this effect is stimulated by ppGpp. Similarly, wild-type DksA retards RNA synthesis by an RNAP mutant with increased sensitivity to DksA, again independently of ppGpp. The template used in these studies lacked pause sites, suggesting that DksA does not slow transcription elongation by stimulating pausing. Finally, DksA stimulates termination at the intrinsic rrnB T1 terminator [69].
In vivo, mutational studies implicate DksA in preventing transcription-replication conflicts. DksA protects cells against UV and other DNA damage, which inhibit transcription elongation [70]. DksA prevents replication arrest in amino acid-starved cells via effects on transcription elongation [71]. Amino acid starvation, which stalls translation, arrests DNA replication in the absence of DksA [71]. This is consistent with the idea that TEC can uncouple from stalled ribosomes and, if not removed by Rho, will backtrack and form a barrier to replication [39,64]. Tehranchi et al. [71] propose that DksA prevents backtracking of uncoupled TEC (rather than resolving backtracked TEC), and, therefore, suppresses replisome clashes. The mechanism by which DksA might accomplish this reaction remains undefined.
A genome-wide survey of TEC occupancy in the presence or absence of DksA supports the notion that DksA suppresses replisome clashes by acting on backtracked TEC. ChIP-chip experiments reveal that DksA is enriched both at promoters and in downstream regions, colocalizing with RNAP across the entire chromosome. DksA suppresses TEC stalling induced by amino acid starvation globally, possibly by blocking backtracking [72]. An alternate interpretation, that DksA removes uncoupled TEC ahead of the replisome, suggested by in vitro studies, has not been ruled out [69]. Genetic evidence further complicates the picture. Thus the sensitivity of dksA mutants to the DNA cross-linker mitomycin C is suppressed by a second mutation in greA [73]. Since both mutations are proposed to increase backtracking or stabilize the backtracked TEC, it is difficult to rationalize this suppression pattern. Reconciling these in vitro and in vivo observations will be, we are afraid, the task of future generations.

λ N Antitermination
λ N suppresses transcription termination in vivo specifically on the λ chromosome. It is directed to TEC by binding to the NUT sequence of the λ nascent transcript via its N-terminal arginine rich motif (ARM), and remains attached to TEC during transcription of the λ early genes. λ N and NusA, B, E and G form an antitermination complex that modifies TEC. The mechanism of action of λ N remains controversial. In vitro, λ N alone can accelerate transcription elongation and suppress transcription termination at Rho-dependent and intrinsic termination sites. However, E. coli NusA factor significantly improves λ N efficiency. Gusarov and Nudler [60] found that λ N has no effect on RNA:DNA hybrid stability in TEC, with or without NusA, suggesting that these factors do not suppress hairpin formation and intrinsic termination by strengthening the hybrid. However, Parks et al. [74] concluded that λ N protein reduces transcriptional slippage within actively growing cells and in vitro. This result suggests that λ N does, in fact, stabilize the RNA:DNA hybrid, particularly at the 5' end. Stabilization is proposed to disfavor dissociation of RNA from the DNA template, thereby suppressing both termination and slippage. In contrast, Gusarov and Nudler [60] suggest that λ N blocks hairpin formation by sequestering the ascending portion of the RNA stem, prohibiting annealing with the descending portion. Clearly, how λ N modifies transcription elongation remains an open question.

HK022 Nun-Mediated Transcription Arrest
HK022 Nun blocks the growth of phage λ by arresting transcription at pause sites distal to the λ nut elements. The arrested TEC is released by the host Mfd factor, thus prematurely terminating transcription on the λ chromosome [75]. Nun binds NUT RNA with its ARM motif, whereas the C-terminal region of Nun interacts with TEC. Other than the effect of Nun on λ growth, no other biological function has been described for the protein. Lytic growth of HK022 is unaffected by nun mutations, and HK022 nun mutants lysogenize with normal frequency. The specificity of Nun exclusion is unique; other phage exclusion systems are active against a broad range of superinfecting phage [60].
It has been suggested that the function of the λ NUT RNA is to tether Nun or λ N, increasing the local concentration of the protein near TEC. Indeed, nut is dispensable for function of both proteins in vitro (see below; [74,76]). Furthermore, Nun overproduction is toxic to E. coli, although λ NUT sites are not encoded in the bacterial chromosome [77]. Toxicity is related to transcription termination, since host RNAP and Nun mutations that block Nun termination also suppress cell killing [77,78]. In vivo, Nun arrest requires the four E. coli auxiliary transcription elongation factors, NusA, NusB, NusE and NusG. Though these factors are not essential for Nun arrest in vitro, they enhance Nun specific activity, reducing the concentration of Nun required to block elongation [78].
TEC paused by nucleotide deprivation in vitro is a substrate for subsequent Nun arrest, although the site of arrest differs depending on the location of the pause [79,80]. Mutational analysis of the Nun C-terminus indicates that a penultimate aromatic residue (W108) is required for Nun arrest (Figure 3, top). The Nun C-terminus crosslinks to template DNA about 9 bp promoter-distal to the RNAP active center. This is compatible with the idea that Nun arrests transcription by anchoring TEC to the DNA template, perhaps via intercalation of the W108 residue. Two neighboring basic Nun C-terminal residues, K106 and K107 (Figure 3, top), are required for efficient arrest. They are thought to aid Nun binding to the negatively-charged DNA template. The Nun mechanism of action was tested on defined TEC scaffolds consisting of DNA template and non-template strands and RNA complementary to the template strand. These TECs differed in the length and the sequences of the RNA primer. Importantly, the scaffolds included no λ DNA or RNA sequences. Nun arrested all TECs tested that carried an RNA:DNA hybrid 9 bp or longer. For each TEC, Nun-mediated arrest occurred at a specific site, corresponding to an intrinsic pause site [77,81,82]. Nun-arrested TEC were found in either the pretranslocated or the posttranslocated state. Nun arrests transcription elongation by preventing movement of TEC from one register to the other [79,80].

Conclusions
Our understanding of transcription elongation has advanced rapidly over the past few years. To a large extent, this reflects the application of structural biology to the elongation reaction, which, in turn, has informed the genetics, allowing construction of relevant mutant RNAPs and auxiliary factors. Nevertheless, we lack a satisfactory mechanistic explanation for the activities of many transcription factors; the mechanisms of Rho, DksA, λ N and HK022 Nun, for example, remain obscure. We expect this to be so only in the short term, as more sophisticated structural and biochemical approaches are applied to determining how genes are transcribed.

Acknowledgments
The authors wish to thank Seth Darst and Arkady Mustaev for helpful comments.

[Figure 3 panel labels: Nun protein sequence and mutants; λ N protein sequence.]