Initial Events in Bacterial Transcription Initiation

Transcription initiation is a highly regulated step of gene expression. Here, we discuss the series of large conformational changes set in motion by initial specific binding of bacterial RNA polymerase (RNAP) to promoter DNA and their relevance for regulation. Bending and wrapping of the upstream duplex facilitates bending of the downstream duplex into the active site cleft, nucleating opening of 13 bp in the cleft. The rate-determining opening step, driven by binding free energy, forms an unstable open complex, probably with the template strand in the active site. At some promoters, this initial open complex is greatly stabilized by rearrangements of the discriminator region between the −10 element and +1 base of the nontemplate strand and of mobile in-cleft and downstream elements of RNAP. The rate of open complex formation is regulated by effects on the rapidly-reversible steps preceding DNA opening, while open complex lifetime is regulated by effects on the stabilization of the initial open complex. Intrinsic DNA opening-closing appears less regulated. This noncovalent mechanism and its regulation exhibit many analogies to mechanisms of enzyme catalysis.


Introduction
Transcription is the first step of gene expression and is therefore one of the most fundamental processes of life. All living organisms, as well as many viruses, encode at least one RNA polymerase (RNAP) enzyme that synthesizes an RNA copy of the template DNA; the core structure and many mechanistic and regulatory elements are shared by bacterial, eukaryotic and archaeal RNAPs (reviewed in [1,2]). Bacterial RNAPs are composed of a core enzyme, which carries out RNA synthesis, and a specificity (σ) subunit for recognition of promoter DNA sequence and subsequent events of initiation (Figures 1 and 2). Specific binding to promoter DNA forms an initial closed complex and sets in motion a series of large conformational changes which bend the downstream duplex DNA into the active site cleft of RNAP and then open the DNA to form a transcription bubble, placing the template DNA strand into the active site. At some (but not all) promoters, subsequent conformational changes in RNAP and the nontemplate strand stabilize this initial open complex (reviewed in [3]). Many steps of this process are highly regulated by promoter DNA sequence, accessory protein factors and small ligands, nucleotide concentration, temperature, salt and solute concentrations, and other environmental variables; aspects of regulation are reviewed in [1,4-7].

Figure 1. Schematic representations of the subunits of RNAP core, σ70, and promoter DNA. RNAP: α2: cyan; β and β': gray; ω: black. σ regions: as shown. Promoter: UP element: cyan; −35 element: blue; extended −10: red; −10 element: yellow; discriminator: orange; transcription start site: green; DNA downstream of the transcription start site: gray. Linker regions in α and σ subunits are shown as springs. Nontemplate strand sequences of a "consensus" and λPR, T7A1 and rrnB P1 promoters are shown below; missing bases are indicated by dashes.
A detailed understanding of the mechanism of initiation and its regulation is centrally important throughout biology, with many applications in biotechnology, pharmacology, and medicine. Determination of noncovalent mechanisms like that of open complex formation and stabilization requires both kinetic and structural studies. High resolution structural studies provide detailed information about reactants and products, as well as any intermediates that can be trapped and stabilized [8][9][10][11][12][13][14][15][16][17], but intermediates in open complex formation have been challenging to obtain and characterize in this way. Kinetic studies are used to investigate effects of promoter and RNAP variants, regulators, and regulatory variables [18][19][20][21][22], as well as to establish reaction conditions necessary to obtain transient high concentrations (bursts) of key intermediates for characterization by footprinting and other methods [23,24]. Real-time DNA footprinting provides snapshots of the population of intermediates (and any reactants and products) at any given time, as well as kinetic information about the change in the distribution of these species with time [23,25-27]. Ensemble and single-molecule experiments with fluorescent probes can provide unparalleled combinations of structural, dynamic and kinetic-mechanistic information [21,28-35].
Studies with Escherichia coli RNAP reveal that the early steps of open complex formation, including initial specific binding to the promoter and some or all of the coupled conformational changes that bend DNA into the cleft, are often rapidly reversible in comparison to the slower "isomerization" step that includes DNA opening and is the rate-determining step of open complex formation. The subsequent large conformational changes that stabilize the initial open complex are faster in the forward direction than the "bottleneck" opening step, and hence must be investigated by dissociation kinetic and mechanistic studies starting with the stable open complex. In the dissociation direction, these conformational changes are reversible on the time scale of the rate-determining DNA closing step. These and other aspects of this [...] during open complex formation, altering relative positions of the modules. Wider clefts, resulting largely from different positioning of the "clamp" module, have been observed in a paused complex [50] and a complex with transcription factor Gfh1 [51]. Opening of the clamp has been proposed to underlie the mechanism of action of the stress alarmone ppGpp [52,53]. Several antibiotics inhibit modular movements of RNAP by binding to or near these switch regions [54][55][56].
Opening of the RNAP clamp likely permits loading of the duplex DNA during initiation. Since the hydrated DNA duplex is about 25 Å wide, early models of open complex formation proposed that DNA opening occurred outside the cleft before entry of the strands [13,37,57]. However, the extended downstream footprint (to +20, relative to the transcription start site) observed for advanced closed complexes demonstrates that the downstream duplex enters the active site cleft before it is opened [27,[58][59][60]. A more open cleft obviates the need for separating the strands prior to entry.

E. coli σ70 Promoter Recognition
In vivo, initiation rates vary at least 10,000-fold for different promoters [61,62]. Rates of open complex formation (at a specified [RNAP]) and dissociation in vitro both span a similar range determined by the sequence and structure of the promoter DNA [61,62]. For a given promoter sequence, changes in temperature, salt, and solute concentrations [24,63-66], as well as additions of protein factors and ligands, can affect these kinetics by 10- to 1000-fold or more.
Promoter elements have been defined structurally, genetically, and/or functionally (summarized in Figure 1). Which promoter regions are most important for which steps in the mechanism? As an extension of the bipartite proposal of promoter function [67], a working hypothesis is that promoter sequence and structure upstream of (and including at least part of) the −10 element direct the steps of initial binding of RNAP and subsequent conformational changes that culminate in bending of the downstream duplex into the cleft. These steps precede the central DNA opening step, which opens approximately 13 bp (−11 to +2 at λPR) of the promoter DNA. Collectively these steps determine the rate of open complex formation. For some (but not all) promoters, promoter elements in and downstream of the −10 element direct post-opening steps of the mechanism (in particular open complex stabilization) that determine the lifetime of the open complex. All of these steps are discussed in subsequent sections.
The farthest upstream sequence-specific interactions between RNAP and promoter DNA are in the UP element region from approximately base −40 to −60 (Figure 1, cyan; reviewed in [68]). UP elements are phased AT-tracts recognized by the C-terminal domains of the α subunits (αCTDs); the narrow minor groove at these sequences favors specific αCTD binding [69]. UP elements were first noted for their ability to increase transcription from the rRNA promoters, and a consensus UP element sequence determined using mutational analysis increased transcription 330-fold in vivo [70,71]. The entire UP element consists of two subsites, proximal and distal, one for each αCTD; promoters may have one or both, although the distal UP element tends to function nearly as well as a full UP element [72]. Because of the curvature generally observed at phased A- or T-tracts, it was hypothesized that the main function of UP elements was to bend the DNA. However, not all UP elements display significant curvature [73], as even single base disruptions in an A- or T-tract can practically abolish bending [74].
At λPR and T7A1, real time and low temperature DNase and hydroxyl radical (OH) footprinting of closed complexes reveals that recognition of the −35, −10, and UP elements induces bending of upstream DNA at −38 and −48 [27,59,75] and wrapping of the region from ~−60 to −80 on RNAP [25][26][27]59,75,76]. Structural evidence for bending at the −35 element is provided by [43,77], and a discussion of functional bending by the αCTDs is provided in [68]. These upstream interactions are very important for efficient open complex formation at λPR and lacUV5 (see below) [75,78].
The −35 element (Figure 1, blue) has the consensus sequence 5'-TTGACA-3' [79], with −35T, −34T, and −33G being the most highly conserved [80]. These bases interact via the major groove with a helix-turn-helix motif of σ4.2 (Figures 1 and 2, blue), which bends the DNA approximately 36° in cocrystal structures of the −35 element and Thermus aquaticus σ4.2 [13,43] and a recent initiation complex structure with E. coli holoenzyme [77]. Strong DNase I footprinting enhancements suggestive of bending are also observed at the upstream end of the −35 element at λPR [75]. This strong bend observed in solution may be due to σ4 interacting with the αCTDs [12,81]. At some promoters, σ4 interacts with activators such as λCI [82] and CAP [11,83-85], influencing multiple steps of transcription initiation. This region can also interact with the core RNAP to mediate recruitment of elongation factors [86].
There is no consensus sequence for the majority of the spacer between the −35 and −10 elements, but there is a consensus length, which is dictated largely by the spacing between σ4.2 and σ2.3 [13]. The most common spacer length for σ70 promoters is 17 bp [79,87,88], and promoters with 17 bp spacers produce greater amounts of transcript in multi-round assays than otherwise identical promoters with other spacer lengths [89][90][91]. The length and extent of bending (as predicted by molecular modeling) of the spacer region modulate the effect of σ1.1 on transcription initiation kinetics and the structure of the open complex [92]; the underlying mechanism is unknown, but the authors speculate that this may be due to σ1.1 acting as a "gatekeeper", discriminating between promoter and non-promoter DNA based on the trajectory of the spacer from the −35 element to the −10 element. Recent structures of E. coli σ70 initiation complexes formed with 4-5 nt nascent RNAs reveal that rotation of σ4 is required to accommodate different promoter spacer lengths [77]. A region within the spacer defined as the "Z-element" (bases −24 to −18) was also identified in a recent study as interacting with the β' zipper (residues 40-45), increasing the amount of abortive transcription from a synthetic promoter [93]. It is not known, however, whether this interaction is sequence-specific or simply specific to a certain spacer conformation or trajectory. At some promoters, an "extended −10" sequence (TGn; Figure 1, red) increases activity through specific contacts with σ3.0 [44,94] (Figures 1 and 2, red), perhaps by increasing the lifetime of the open complex [95]. The TGn motif was found in 20% of the 554 promoters identified in one study [87], with 44% of these promoters having a G at −14. The TGn motif may be particularly important at promoters with weaker −35 elements [43] or longer spacers [87]. In the E. coli initiation complex structure, the extended −10 element is recognized by the insertion of two perpendicular helices of σ2 and σ3 into the major groove [77].
The −10 element (Figure 1, yellow), with an all-AT bp consensus sequence (5'-TATAAT-3'), constitutes the upstream half of the region opened by RNAP, with opening likely initiated near its upstream end. The transcription bubble forms downstream of base −12 (for λPR and other promoters with six bp discriminators; see Figure 1). This position remains base paired [77,96], with the E. coli σ70 Q437 likely making a key sequence-specific contact [97] with base −12A of the template strand [10]. After opening, the −10 region of the nontemplate strand interacts with conserved residues of σ, with the nearly invariant bases −11A and −7T in pockets of σ70 [10,17,98,99]. Residues of both template and nontemplate strands interact with σ2.4 [10,17]. Modeling and biochemical data suggest that these contacts require prior strand separation [10]. It remains unresolved whether recognition of this region requires the DNA to be single-stranded.
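The consensus hexamers and preferred spacer length described above can be illustrated with a minimal sketch that counts matches to 5'-TTGACA-3' and 5'-TATAAT-3' on the nontemplate strand. The function names, the 16 to 18 bp spacer window, and the match-count score are assumptions introduced here for illustration only; the score is not a predictor of promoter strength, and the test sequence is synthetic.

```python
# Consensus hexamers quoted in the text for E. coli sigma70 promoters.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def hexamer_matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a 6-nt hexamer matches the consensus."""
    if len(hexamer) != 6 or len(consensus) != 6:
        raise ValueError("both sequences must be 6 nt long")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def best_promoter_window(seq: str, spacers=(16, 17, 18)):
    """Scan a nontemplate-strand sequence for the best -35/spacer/-10 layout.

    Returns (total_matches, start_index, spacer_length), or None if the
    sequence is too short. The spacer window is an illustrative assumption.
    """
    seq = seq.upper()
    best = None
    for spacer in spacers:
        window = 6 + spacer + 6
        for start in range(len(seq) - window + 1):
            m35 = hexamer_matches(seq[start:start + 6], CONSENSUS_35)
            ten_start = start + 6 + spacer
            m10 = hexamer_matches(seq[ten_start:ten_start + 6], CONSENSUS_10)
            candidate = (m35 + m10, start, spacer)
            # Keep the first arrangement with the highest match count.
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best


# Synthetic example: both consensus hexamers separated by an arbitrary 17-nt spacer.
synthetic = "GC" + CONSENSUS_35 + "C" * 17 + CONSENSUS_10 + "GC"
print(best_promoter_window(synthetic))
```

A real analysis would weight positions by conservation (for example −11A and −7T) rather than treating all twelve positions equally.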
The base −11A is thought to nucleate DNA opening; alanine substitutions of aromatic residues in the −11A binding pocket on σ2.3 impair open complex formation, but not binding to duplex DNA [100]. Flipping of base −11A into its pocket may thus nucleate bending of the DNA into the cleft before opening at some promoters; alternatively, other RNAP-DNA interactions may first bend the DNA, nucleating −11A flipping [10]. A recent comparative kinetic study of all 4096 −10 variants [22] finds that RNAP interactions with −7T play a relatively stronger role in open complex formation kinetics, suggesting a model in which bubble nucleation occurs at −7T, or −7T flipping stabilizes the bubble in the transition state, or both. While −11A and −7T are the most conserved bases of the −10 element [79,88,101], they are not absolutely required for relatively fast initiation kinetics, particularly if the −10 element is sufficiently AT-rich [22]. A recent structure shows that E. coli σE uses a similar base flipping mechanism to recognize, and possibly nucleate opening at, the −10 element [102].
The discriminator region between the −10 element and the start site (Figure 1, orange) is involved in regulation of open complex lifetime. Its upstream end interacts with σ1.2 [9,17,77,95,103] (Figures 1 and 2, orange). Most discriminators are 6-8 bases in length [88]. Mutational analyses [104][105][106] indicate that lifetime decreases as discriminator length increases from 6 to 8 bases. A bioinformatic analysis of E. coli promoters [107] found that the frequency of discriminators also decreases with increasing length from 6 to 8 bases. These observations together indicate that many E. coli promoters form long-lived open complexes. More recent efforts to comprehensively map the transcription start sites in E. coli [108] will no doubt lead to elucidation of the in vivo importance of the sequence of this region. Structural data indicate that for a six-base discriminator, −6G on the nontemplate strand flips into a pocket on the surface of σ1.2 [17]. The −5 nontemplate strand base is also very important for open complex stability, with a G being relatively stabilizing and a C being relatively destabilizing [95]. Similarly, σA from T. aquaticus binds most favorably to a 5'-GGG-3' sequence immediately downstream of the −10 element [109]. The rRNA promoters generally have a C on the nontemplate strand two bases downstream of the −10 element [110], leading to short open complex lifetimes; this sequence element is important for their regulation in vitro and in vivo [95].
A recent crystal structure of a model open complex identifies a "core recognition element" (or "CRE", bases −4 to +2 on the nontemplate strand) recognized by the β subunit [17]. Base +2G is flipped into a pocket, increasing the lifetime of an RNAP-fork junction complex containing a consensus −10 element and strong discriminator [17]. Each CRE base except −1 appears to be specifically recognized, and substitutions in the corresponding β residues lead to defects in transcription initiation. No consensus sequence of CRE has been detected [87] and thymine bases in this region are generally observed to be reactive to permanganate, implying that they are exposed to solvent. However, in vivo footprinting argues that CRE interacts with RNAP during elongation [111] and base-specific interactions involving the CRE have been implicated in pausing [112].
The most common transcription start site base is A G > T >> C [106]. Subsaturating concentrations of the initiating nucleotides affect the kinetics of initiation, just as a subsaturating concentration of any substrate affects the velocity V of an enzyme-catalyzed reaction (where V < Vmax). In addition, binding of the initiating NTP stabilizes the short-lived open complex at rRNA promoters [113], shifting the distribution of promoter complexes from closed to open in an NTP concentration-dependent manner.

Initial Binding and Subsequent Conformational Changes to Form and Stabilize the Open Complex at Eσ70 Promoters
Ensemble and single-molecule kinetic-mechanistic studies with wild-type or variant RNAP and promoter DNA, DNA footprinting in real time or at low temperature, crosslinking and fluorescence studies, and high resolution structures of initial and final states provide much information about the sequences of conformational changes and intermediate complexes on the pathway that forms the initial open complex and converts it (at some but not all promoters) to a more stable, longer-lived open complex (Figure 3). Initial specific binding to the promoter sets in motion conformational changes in which the RNAP molecular machine operates on promoter DNA to bend, wrap, and open the duplex and stabilize the open complex, with mobile regions of RNAP playing key roles. The rates and equilibria of these conformational changes are functions of promoter sequence, solution conditions, and regulatory factors or ligands. Rates of open complex formation and lifetimes of the most stable open complex differ by three to four orders of magnitude or more for different promoters under typical in vitro transcription conditions. In this section, we qualitatively discuss this pathway. In the following section, we discuss the interpretation of the experimentally determined rate and equilibrium constants for open complex formation and dissociation in terms of these conformational changes.

The Promoter Search and Initial Specific Binding to Form RPC
During initiation, RNAP holoenzyme first searches for and specifically recognizes promoter DNA, forming an initial closed complex generally called RPC (Figure 3). The bimolecular rate constant for formation of a closed complex between Eσ70 and the rrnB P1 promoter on a short (204 bp) DNA is 2 × 10^8 M^−1·s^−1 [71]; that for a closed complex between Eσ54 RNAP and an 853 bp fragment containing the S. typhimurium glnAp2 promoter is 2.1 × 10^7 M^−1·s^−1 [21]. These provide lower bounds on the rate constant for RPC formation at these promoters, which are one or two orders of magnitude less than the theoretical three-dimensional (3D) diffusion limit (~6 × 10^9 M^−1·s^−1 for equal-sized reactants). Is the rate of formation of the initial specific complex RPC determined by 3D diffusion of RNAP and promoter DNA, or by nonpromoter binding followed by 1D diffusion (i.e., sliding) and/or hopping of RNAP, which can increase the rate constant above the 3D limit? Although some early studies (including [114][115][116][117][118]) argued in favor of RNAP sliding, recent single-molecule studies provide evidence against it [119,120], while suggesting a role for RNAP hopping [121]. Since thermodynamics is path-independent, sliding or hopping should not affect the equilibrium constant for forming RPC. This equilibrium constant, and not the second order rate constant for forming RPC, is generally the significant quantity for the kinetics of open complex formation (see Section 5). Hence sliding or hopping (if they occur) will increase the rate of open complex formation and initiation only if RPC formation is irreversible. Nonpromoter binding of RNAP [122] can have either a competitive or facilitating effect on promoter binding, depending on DNA length and solution conditions.
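As a quick arithmetic check of the comparison above, the following sketch computes how far each quoted bimolecular rate constant lies below the quoted 3D diffusion limit. The dictionary labels and the helper name `orders_below_limit` are introduced here for illustration; the numerical values are those given in the text.

```python
import math

DIFFUSION_LIMIT = 6e9  # M^-1 s^-1, approximate 3D limit quoted in the text

# Bimolecular closed-complex rate constants quoted in the text (M^-1 s^-1).
measured = {
    "E-sigma70 / rrnB P1 (204 bp)": 2e8,
    "E-sigma54 / glnAp2 (853 bp)": 2.1e7,
}


def orders_below_limit(k_on: float, limit: float = DIFFUSION_LIMIT) -> float:
    """Return log10(limit / k_on), the gap in orders of magnitude."""
    return math.log10(limit / k_on)


for name, k_on in measured.items():
    gap = orders_below_limit(k_on)
    print(f"{name}: {gap:.1f} orders of magnitude below the 3D limit")
```

The two gaps come out near 1.5 and 2.5 orders of magnitude, in line with the "one or two orders of magnitude" stated above.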
The initial specific closed complex RPC (Figure 3) and/or other early specific promoter complexes have been characterized by real time OH and permanganate footprinting of the T7A1 promoter [25,26] and by low temperature DNase, OH and permanganate footprinting of other promoters [58,123-127]. Footprinting reveals that RPC is closed and that RNAP is bound to the UP element, −35, spacer and −10 regions, protecting the upstream DNA from −55 to −5. Since no sites of enhanced DNase reactivity are observed in RPCs [58,123-127], the downstream duplex is presumably unbent. At λPR, only more advanced closed complexes than RPC have been detected, indicating that RPC must be significantly less stable than these extended-footprint closed complexes under the conditions studied.

Upstream Bending and Wrapping to Form More Advanced Closed Complexes
The first conformational changes induced by RPC formation may occur upstream of the −35 region. DNase footprinting of λPR closed complexes reveals enhanced reactivity at −38 on the template strand and within the UP element region (−48 on the template strand, −45 on the nontemplate strand), indicating DNA bending or other distortions at these positions [75]. Periodic protection of the upstream DNA from OH attack is observed upstream to approximately base −80 [27], and downstream to base +20. This indicates that the far upstream promoter duplex is lying on the "back" surface of RNAP (see Figure 3 and [27]).
The mechanism of forming this bent, wrapped interface is unclear. Since the αCTDs are on flexible tethers, extension of the tethers allows them to bind specifically to the UP element or nonspecifically to DNA upstream of the −35 element in RPC (see Figure 3). Specific binding to the proximal UP element brings one αCTD (binding site centered at −42) into contact with region 4 of σ70; mutation of σ region 4 residues within this σ4-αCTD interface reduces transcription from some, but not all, UP element-containing promoters [81,83]. A second αCTD binds to the distal site, centered at base −52 [72]. There is no evidence that the two αCTDs interact in open complex formation [68], but efficient transcription from promoters having only a distal UP element requires both αCTDs, suggesting that the second αCTD must somehow affect the overall stability of the complex [68,72]. In the model of Davis et al. [27], formation of these interfaces bends the DNA between −35 and −60 nearly 100°, driving presumably nonspecific interactions between RNAP and DNA upstream of base −60.
Interactions of the upstream promoter DNA with the αCTD are observed to be very important for efficient open complex formation at λPR, which appears to have a distal UP element [75], and for efficient initiation at lacUV5, which has no UP element [78]. Truncation of λPR at base −47 (removing the specific distal UP element sequence and upstream DNA and, thus, probably most of the effect of the UP element [72]), or of lacUV5 at base −45, reduces the rate of the isomerization/DNA opening step (see Section 5) by 1.5-2 orders of magnitude; deletion of the αCTD and mutation of a single residue which eliminates αCTD-DNA binding have similar effects [78,128]. Smaller but significant effects of truncations upstream of −60 on the isomerization/DNA opening step are also observed [78,128]. Because the strong dependences of the kinetics on upstream DNA length are similar for both promoters, only nonspecific interactions with the αCTD appear necessary to generate these effects.
Far upstream interactions persist in the open complex at some (perhaps many) promoters. For lacUV5 open complexes, crosslinks between the αCTDs and nonspecific upstream DNA are observed even to base −90 [129]. Atomic force microscopy demonstrates that at λPR, DNA is extensively wrapped around RNAP in the open complex [76].
In the conversion of RPC to I1,L (Figure 3), the downstream DNA must be bent by at least 90° at the upstream end of the −10 element to enter the cleft [16,60]. What bends the duplex is not known. The −11A base flips out of the stacked duplex and into a pocket of σ70 region 2 [9,10,17]. If base flipping occurs before this region is bent, it would introduce a point of greater flexibility into the duplex, resulting in spontaneous bending of the duplex and its capture by the cleft. Alternatively, bending of the −10 region by RNAP may occur first, destabilizing the stacking and resulting in base flipping. If the interaction of σ70 region 2 with the unbent −10 region in early closed complexes were limited to the −12 position, this interaction might either induce flipping of the adjacent −11A base or dictate the site of bending and subsequent base flipping after a downstream interaction [10,97].
In-cleft elements like σ1.1 and downstream mobile elements (DME; see below) of free RNAP holoenzyme are positioned (Figure 2) to prevent entry of nonpromoter DNA into the active site cleft. These elements of RNAP appear to be part of an extensive, sophisticated network of upstream and downstream interactions that regulate the access of the downstream duplex to the cleft to form I1,L. Conversion of RPC to I1,L appears to involve a series of closed intermediates with different extents of downstream interaction, in which the extent of bending into the cleft may be coupled to the extent of upstream bending and wrapping discussed above. This regulatory network responds to promoter sequence and structure, as well as factor binding and solution conditions. Many of these same elements may also be involved in the regulation of open complex lifetime (see below).
Several lines of evidence exist for this network. As discussed above, truncation of upstream DNA greatly reduces the rate of the isomerization step of open complex formation. In addition, truncation of λPR at −47 shortens the downstream DNase footprint boundary of the advanced closed complex ensemble from +20 to +7 on the nontemplate strand and +2 on the template strand, indicating that only the upstream half of the downstream duplex (−10 to +2/+7) is inserted in the cleft in this closed complex [75].
Similarly, an RNAP variant lacking σ1.1 forms a closed complex footprint at early times at λPR, with a downstream boundary of ~−5 [47]. Thus, upstream DNA interactions and σ1.1 facilitate bending of the downstream duplex into the cleft.
Similarly, at rrnB P1, the complex formed at equilibrium in the absence of nucleotides at 12 °C has a downstream boundary which is approximately +1 [130,131]. This footprint boundary requires that there be some bending at the −10 element, but not enough to fully load bases −10 to +20 into the cleft. What could be the reason for this partial bending in these complexes? We propose that impediments (σ1.1 and/or DME) to full entry of the downstream duplex into the cleft have not been fully removed.
At λPR, the series of conformational changes initiated by the binding interactions in RPC and culminating with bending the downstream duplex fully into the active site cleft is prerequisite for subsequent efficient DNA opening [27,64]. Upstream bending and wrapping are clearly targets of regulation by factors and ligands. Some transcription factors (e.g., CAP, IHF, Fis, and other nucleoid associated proteins) bind in or near the −40 to −60 region, bending DNA while also replacing or affecting the interaction of this region with the αCTD [6,7,68,133].

DNA Opening and Closing in the Cleft
At λPR, opening of the entire 13 bp transcription bubble occurs in the rate-determining isomerization step that forms the first open intermediate, I2 (k2 in Figure 4) [23]. As for DNA melting in solution in the absence of RNAP, this step is highly temperature dependent, increasing 1000-fold between 7 °C and 42 °C with an activation energy of 34 kcal/mol [60,66]. However, unlike in solution, the DNA opening step on RNAP is only weakly dependent on the concentrations of salt and solutes, consistent with a scenario in which it occurs in the local environment of the cleft [64]. The process of transcription bubble closing (k−2 in Figure 4), which is rate determining in dissociation, is even less affected by salt and solute concentration [24] and, at λPR and T7A1, does not appear to be strongly affected by DNA sequence or any modification of RNAP tested thus far, including large deletions [18].
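The quoted temperature dependence and activation energy can be cross-checked with a short Arrhenius calculation. This is a sketch that assumes simple Arrhenius behavior with a temperature-independent activation energy over 7 to 42 °C, which the text does not state; the function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def activation_energy(rate_ratio: float, t_low_c: float, t_high_c: float) -> float:
    """Arrhenius activation energy (kcal/mol) from a rate ratio k(T_high)/k(T_low).

    Assumes ln(k_high / k_low) = (Ea / R) * (1/T_low - 1/T_high).
    """
    t_low = t_low_c + 273.15
    t_high = t_high_c + 273.15
    return R_KCAL * math.log(rate_ratio) / (1.0 / t_low - 1.0 / t_high)


# 1000-fold increase in the opening rate constant between 7 and 42 degrees C.
ea = activation_energy(1000.0, 7.0, 42.0)
print(f"Ea = {ea:.1f} kcal/mol")
```

Under this assumption the 1000-fold increase corresponds to an activation energy between 34 and 35 kcal per mole, consistent with the value of 34 cited above.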
Because the rate and equilibrium constants for transcription bubble opening and closing are only weakly (if at all) dependent on salt and solute concentrations, we propose that the regulation of the rate of open complex formation (and the rate of initiation in cases when open complex formation is the bottleneck) occurs primarily in the steps that form and remodel the closed complex, but not in the actual DNA opening step. However, whether this mechanism is universal is still a subject of some debate. Comparison of the timing of permanganate reactivity and downstream DNA protection at T7A1 has led to the proposal that at T7A1 opening occurs before the DNA is bent into the active site cleft [25].

Open Complex Stabilization
The conversion of the relatively unstable initial open complex (I2) to the highly stable open complex (RPO) at λPR involves conformational changes in DNA and RNAP. Comparison of permanganate footprints of λPR I2 and RPO indicates that the discriminator region of the nontemplate strand is repositioned in the cleft, becoming more reactive in RPO [23,95]. The large dependences of the equilibrium constant K3 for this step on urea and salt concentration indicate the folding/assembly of approximately 120 RNAP residues of the jaw and other DME on the downstream duplex [134]. Comparative quantitative studies with λPR truncated at base +12 and a β' jaw deletion mutant of RNAP indicate that the β' jaw DME [127] (Figure 2, brown) and β' sequence insertion 3 DME (also termed β' insertion 6; Figure 2, green) are involved in the assembly-DNA binding process that converts I2 to the much more stable RPO at λPR. In this model, supported by studies using RNAP and DNA heteroduplexes of varying lengths [29][30][31]135], the DME assemble on downstream DNA, stabilizing the open complex. Recent evidence suggests that the interaction of DNA between positions +4 and +6 with switch regions 1 and 2 may trigger cleft closure and DME folding/assembly and clamping [29].

Figure 4 (partial caption). [...] (I1) intermediates, the forward rate constant k2 of the isomerization step that includes DNA opening, and the excess RNAP concentration. Likewise, kd is determined by the late steps of the mechanism (boxed in green), including the rate constant k−2 for DNA closing and the equilibrium constant K3 for stabilization of the initial open complex I2 to form longer-lived I3 and/or RPO complexes [65,66].
Kinetic dissection of open complex stabilization at λPR led to the discovery of a third on-pathway intermediate designated I3 [24]. Formation of I3 from I2 involves the majority (approximately two-thirds) of the folding and assembly of RNAP DME involved in converting I2 to RPO. The extent of this assembly is intriguingly similar to that observed for the open complex formed by a β' jaw deletion variant, which may therefore be an analog of I3 [134]. Another potential analog, based on similarity in lifetime and downstream interactions, might be the open complex formed at T7A1 [18,136]. The lifetime of the open complex formed with rrnB P1 (~1 s) [95] is intriguingly similar to that of I2 [24], consistent with the idea that the rrnB P1 sequence (in particular the weak discriminator and short spacer) precludes the stabilizing interaction from cleft closing and DME assembly induced by promoters like λPR and T7A1. Further experiments are required to explore the roles of promoter elements in the mechanism, and the potential relevance of I2, I3 and other forms of the open complex to transcription initiation and its control.
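How stabilization of I2 lengthens open complex lifetime can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in the text, that I2 and the stabilized open complex equilibrate rapidly (equilibrium constant K3) relative to DNA closing (rate constant k−2), and that closing is followed by fast dissociation, so that kd = k−2/(1 + K3). The value k−2 = 1 s^−1 is chosen only to echo the ~1 s lifetime mentioned for I2; the K3 values are hypothetical.

```python
def dissociation_rate_constant(k_minus2: float, K3: float) -> float:
    """Overall dissociation rate constant under a rapid pre-equilibrium assumption.

    Only the fraction 1 / (1 + K3) of open complexes is present as I2 and can
    close with rate constant k_minus2, so kd = k_minus2 / (1 + K3).
    """
    if k_minus2 < 0 or K3 < 0:
        raise ValueError("k_minus2 and K3 must be non-negative")
    return k_minus2 / (1.0 + K3)


K_MINUS2 = 1.0  # s^-1, illustrative value only

for K3 in (0.0, 99.0, 9999.0):  # hypothetical stabilization constants
    kd = dissociation_rate_constant(K_MINUS2, K3)
    print(f"K3 = {K3:g}: kd = {kd:.2e} s^-1, lifetime = {1.0 / kd:.0f} s")
```

In this picture a promoter that cannot stabilize I2 (K3 near zero) dissociates at the closing rate, while each 100-fold increase in K3 lengthens the lifetime by roughly 100-fold without any change in k−2.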

Forming and Stabilizing the Open Complex; Regulation of the Kinetics by Promoter Sequence, Length and Other Variables
Kinetic studies are essential to any determination of mechanism, together with structural information about intermediates. Among other insights, kinetic data provide evidence for the minimum number of key intermediates and indicate how to trap or populate them. Knowledge of kinetics and mechanism is necessary to understand how promoter sequence, regulatory factors, ligands, and solutes/salts exert their effects, and to design inhibitors, including antibiotics.

Kinetics and Mechanism of Open Complex Formation
The kinetics of open complex formation in excess RNAP at concentration [R] are found to be single exponential with a rate constant (kobs), which is a hyperbolic function of [R]. These kinetic data are well described by Mechanism 1 (see also Figure 4) with a reversible initial step (with equilibrium constant K1) that includes promoter binding and forms a closed intermediate I1, followed by a rate-determining "isomerization" step (with rate constant k2) to convert I1 to the stable open complex (RPO) [137,138]. The observation of single exponential kinetics means that the species in the I1 ensemble rapidly equilibrate with one another and with free P on the time scale of its conversion to RPO [139]. Analysis of Mechanism 1 for this situation (see Figure 4, bottom) yields:

$$k_{obs} = k_2\,\theta_{I_1} \qquad (1)$$

where θI1 is the fraction of closed promoter DNA present as I1 complexes:

$$\theta_{I_1} = \frac{K_1[R]}{1 + K_1[R]} \qquad (2)$$

The interpretation of K1 and k2 depends on the details of the I1 ensemble, as indicated for the particular case of two closed complexes below.
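The hyperbolic dependence of kobs on [R] can be illustrated numerically. The following is a minimal sketch, assuming the Mechanism 1 form kobs = k2 K1[R]/(1 + K1[R]); the function name `k_obs` and the chosen concentrations are illustrative, and the K1 and k2 values are those quoted in this review for full-length λPR at 37 °C.

```python
def k_obs(rnap_conc, K1, k2):
    """Observed rate constant (s^-1) for Mechanism 1 in excess RNAP.

    rnap_conc: RNAP concentration [R] in M
    K1: equilibrium constant for forming the I1 ensemble (M^-1)
    k2: isomerization rate constant (s^-1)
    """
    # Fraction of closed promoter DNA (P + I1) that is present as I1.
    theta_I1 = K1 * rnap_conc / (1.0 + K1 * rnap_conc)
    return k2 * theta_I1


# Full-length lambda PR at 37 C, values quoted in this review.
K1_FL = 5.8e6  # M^-1
k2_FL = 0.66   # s^-1

# Illustrative concentrations only; they are not taken from the experiments.
for conc_nM in (10, 30, 100, 1000, 10000):
    rate = k_obs(conc_nM * 1e-9, K1_FL, k2_FL)
    print(f"[R] = {conc_nM:>6} nM   kobs = {rate:.3f} s^-1")
```

In this form kobs rises linearly with [R] at low concentration, reaches k2/2 at [R] = 1/K1, and plateaus at k2 when the promoter is saturated.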

Kinetics of Open Complex Formation for λPR and λPR Variants: Interpretation of kobs
As an example relevant for analysis of the single-exponential kinetics of open complex formation by full-length (FL) and upstream truncated (UT) λPR promoters, consider the situation where the I1 ensemble consists of one early closed complex (designated I1,E) and the most advanced closed complex (I1,L). Since the kinetics of RPO formation are single-exponential, I1,E and I1,L are in rapid equilibrium with one another and with P on the time scale of conversion of I1,L to RPO. For this case Mechanism 1 is rewritten as:

$$R + P \overset{K_{1,E}}{\rightleftharpoons} I_{1,E} \overset{K_{1,L}}{\rightleftharpoons} I_{1,L} \xrightarrow{k_{open}} RP_O \qquad \text{(Mechanism 2)}$$

where kopen is the rate constant for the elementary, rate-determining DNA opening step involving I1,L. For Mechanism 2, kobs in excess RNAP is given by Equation (3):

$$k_{obs} = k_{open}\,\theta_{I_{1,L}} \qquad (3)$$

where θI1L is the fraction of closed promoter DNA present as I1,L complexes:

$$\theta_{I_{1,L}} = \frac{K_{1,E}K_{1,L}[R]}{1 + K_{1,E}(1 + K_{1,L})[R]} \qquad (4)$$

Comparison of Equations (1)-(4) shows that the observed isomerization rate constant k2 of Mechanism 1 is interpreted using Mechanism 2 as:

$$k_2 = k_{open}\,\frac{K_{1,L}}{1 + K_{1,L}} \qquad (5)$$

and that the binding constant K1 of Mechanism 1 is:

$$K_1 = K_{1,E}(1 + K_{1,L}) \qquad (7)$$

In words, K1 of Mechanism 1 is the overall I1 binding constant written in terms of total I1 concentration.
If the mechanism is more complicated than Mechanism 2, with additional on-pathway closed complexes in rapid equilibrium with one another on the time scale of DNA opening, Equations (5) and (7) increase in complexity, but the general principle still applies that the observed isomerization rate constant k2 is the product of kopen and the fraction of closed complexes that are I1,L. Only if all closed complexes are I1,L (i.e., K1,L >> 1) does k2 = kopen; in this case K1 = K1,EK1,L for Mechanism 2.
Here we apply Mechanism 2 to interpret the large differences in K1 and especially in k2 for open complex formation by full length (FL) and upstream-truncated UT-47 λPR promoters at 37 °C [75]. Isomerization is much faster for FL λPR (k2 = 0.66 s⁻¹) than for UT-47 λPR (k2 = 0.02 s⁻¹), but closed complex binding is weaker (K1 = 5.8 × 10⁶ M⁻¹ for FL; K1 = 2.6 × 10⁷ M⁻¹ for UT-47). For FL λPR, low temperature equilibrium hydroxyl radical (·OH) footprinting [59] and fast ·OH footprinting [140,141] reveal that the population of closed I1 intermediates is advanced, with the DNA significantly protected to +20. Hence the observed isomerization rate constant k2 is probably not much less than the DNA opening rate constant kopen. Here, for purposes of illustration, we estimate kopen to be 1 s⁻¹ at 37 °C. (The maximum rate of transcription initiation at 37 °C is known to be ~1 s⁻¹ [61,62], indicating that kopen ≥ 1 s⁻¹.) From k2/kopen = 0.66 and Equation (5), K1,L = 2, so 2/3 of closed complexes are I1,L, consistent with the footprinting data on the I1 ensemble. From K1 = 5.8 × 10⁶ M⁻¹ and the interpretation of k2 in Equation (5) and Mechanism 2, K1,E = K1/3 = 1.9 × 10⁶ M⁻¹. For FL λPR a plausible free energy vs. progress diagram for the steps of open complex formation is given in Figure 5A, which is drawn to scale for [R] = 30 nM, corresponding to K1,E[R] = 0.06 and a correspondingly small occupancy of I1,E at 30 nM RNAP. The barrier height for the rate-determining opening step (rate constant kopen = 1 s⁻¹) is set for (I1,L-I2)‡ transition state decomposition frequencies of 10³ s⁻¹ in each direction [128].
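To see what barrier height this choice of decomposition frequency implies, the following minimal sketch assumes the simple relation k = ν exp(−ΔG‡/RT), with ν the transition state decomposition frequency. That functional form, the function name `barrier_height_kj`, and the conversion to kJ/mol are assumptions made for illustration; the review itself gives only the rate constant (1 s⁻¹) and the frequency (10³ s⁻¹).

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def barrier_height_kj(rate_constant, decomposition_frequency, temperature_k):
    """Free energy barrier (kJ/mol), assuming k = nu * exp(-dG/RT).

    rate_constant: observed rate constant k for the step (s^-1)
    decomposition_frequency: assumed prefactor nu (s^-1)
    temperature_k: absolute temperature (K)
    """
    ratio = decomposition_frequency / rate_constant
    return GAS_CONSTANT * temperature_k * math.log(ratio) / 1000.0


T_37C = 310.15  # 37 C in kelvin

# Opening step: k_open = 1 s^-1 with an assumed prefactor of 1e3 s^-1.
barrier_open = barrier_height_kj(1.0, 1.0e3, T_37C)
print(f"Barrier for DNA opening: {barrier_open:.1f} kJ/mol")
```

Under these assumptions the barrier is RT ln(10³), about 17.8 kJ/mol (roughly 4.3 kcal/mol) at 37 °C. A different choice of prefactor would shift every barrier in the diagram by the same amount, leaving the relative heights unchanged.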
For the UT-47 λPR variant, in which the distal UP element and far upstream DNA are missing, k2 is only 3% of the FL λPR value [75], even though there is no obvious reason for upstream truncation to affect the intrinsic opening rate constant kopen. In addition, the closed complex binding constant K1 is 4.5 fold larger than for FL λPR, even though favorable distal UP element interactions are eliminated by truncation at −47. Should the large reduction in k2 for UT-47 be interpreted as a reduction in kopen or as a shift in the closed complex ensemble to less advanced species? Footprinting evidence supports the latter interpretation: in the UT-47 I1 ensemble, the downstream boundary of DNase protection is at +2/+7, indicating that the downstream duplex is only partially bent into the cleft. Hence, the absence of far-upstream DNA results in a less advanced closed complex ensemble for UT-47 λPR.

Figure 5 (caption fragment; the beginning is missing): ... [24,60] and UT-47 λPR [75] at 37 °C (see text for details). Free energies of FL and UT-47 λPR are set at the same value. For purposes of illustration, for both variants, we estimate that kopen = 1 s⁻¹ and that (I1,L-I2)‡ decomposition frequencies are 10³ s⁻¹ in each direction, and assume that k-2 for UT-47 λPR is the same as that determined for FL λPR [24].

At 37 °C, interpretation of k2 = 0.02 s⁻¹ for UT-47 λPR [75] in terms of Mechanism 2 with kopen = 1 s⁻¹ indicates that K1,L ~ 0.02, so that only 2% of the population of closed complexes are I1,L and 98% are I1,E, very different from the 66% I1,L, 34% I1,E distribution for FL λPR. From K1 = 2.6 × 10⁷ M⁻¹ [75] and this interpretation of k2, K1,E = K1/1.02 = 2.5 × 10⁷ M⁻¹, 13-fold larger than that calculated for FL λPR. The free energy vs. progress diagram for the steps of open complex formation with UT-47 λPR is given in Figure 5B, which is also drawn to scale for an excess RNAP concentration of 30 nM, corresponding to K1,E[R] = 0.76 and a much higher occupancy of I1,E, relative to both free promoter DNA and I1,L, than for FL λPR. As in Figure 5A, the barrier height for the rate-determining opening step (rate constant kopen = 1 s⁻¹) is set for a transition state decomposition frequency of 10³ s⁻¹ in each direction.
For UT-47, formation of I1,L, required for the subsequent opening step, is greatly disfavored by upstream truncation of the promoter DNA, resulting in a much reduced isomerization rate even though the intrinsic opening rate kopen is left unchanged. The very significant increases in K1 (and K1,E in Figure 5B) for UT-47 relative to FL λPR are difficult to explain, since they indicate that the far-upstream DNA actually destabilizes early closed intermediates like I1,E. Presumably only the αCTD are capable of extending sufficiently to interact with far upstream DNA in I1,E, but if such interactions are unfavorable it is unclear why they would occur. Published kinetic data for the effect of deleting σ1.1 on the kinetics of open complex formation show similar effects to those of upstream truncation: the isomerization rate constant is dramatically reduced while the closed complex binding constant increases [47]. In this case, deletion of σ1.1 may reduce both K1,L and kopen, while increasing K1,E.
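The dissection of K1 and k2 into K1,E and K1,L for the two promoters can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming the Mechanism 2 relations k2 = kopen K1,L/(1 + K1,L) and K1 = K1,E(1 + K1,L), together with the illustrative estimate kopen = 1 s⁻¹ used in the text; the function name `dissect_mechanism2` is hypothetical.

```python
def dissect_mechanism2(K1, k2, k_open):
    """Split Mechanism 1 parameters into Mechanism 2 quantities.

    Assumes k2 = k_open * K1_L / (1 + K1_L) and K1 = K1_E * (1 + K1_L).
    Returns (K1_L, K1_E, fraction of closed complexes that are I1,L).
    """
    if not 0.0 < k2 < k_open:
        raise ValueError("requires 0 < k2 < k_open")
    frac_late = k2 / k_open                  # fraction of I1 present as I1,L
    K1_L = frac_late / (1.0 - frac_late)     # [I1,L] / [I1,E]
    K1_E = K1 / (1.0 + K1_L)                 # binding constant for I1,E (M^-1)
    return K1_L, K1_E, frac_late


K_OPEN = 1.0          # s^-1, illustrative estimate used in the text
RNAP_CONC = 30e-9     # M, the concentration for which Figure 5 is drawn

# (K1 in M^-1, k2 in s^-1) at 37 C, as quoted in this review.
promoters = {"FL": (5.8e6, 0.66), "UT-47": (2.6e7, 0.02)}

results = {}
for name, (K1, k2) in promoters.items():
    K1_L, K1_E, frac_late = dissect_mechanism2(K1, k2, K_OPEN)
    results[name] = (K1_L, K1_E, frac_late)
    print(f"{name}: K1,L = {K1_L:.3g}, K1,E = {K1_E:.3g} M^-1, "
          f"I1,L fraction = {frac_late:.2f}, K1,E[R] = {K1_E * RNAP_CONC:.2f}")
```

The unrounded values differ slightly from those in the text, which rounds K1,L for FL λPR to 2 and therefore uses K1,E = K1/3.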

Kinetics and Mechanism of Open Complex Dissociation
Dissociation of the stable open complex RPO to free promoter DNA upon addition of a competitor (to make dissociation irreversible) exhibits first order, single-exponential kinetics with rate constant kd (Figure 4). For the two promoters investigated, λPR [65,66] and lacUV5 [61], both of which form long-lived RPO complexes at 37 °C, kd decreases strongly with increasing temperature. This negative activation energy of dissociation indicates that one or more intermediates are kinetically significant in dissociation, and (together with the single-exponential kinetics) indicates these intermediates are in rapid equilibrium with RPO on the time scale of the rate-determining step in dissociation. These intermediates occur after the "bottleneck" step in the forward direction and do not contribute to the rate of RPO formation. How many intermediates are there, and are they open or closed? A combination of kinetic and fast permanganate footprinting studies reveals that at least two such intermediates are kinetically significant at λPR (designated I2 and I3 in Figures 3 and 4); these are open complexes with the bubble extending from −11 to +2 (like RPO) which are much less stable than RPO [23,24]. DNA closing occurs in the conversion of I2 to I1,L [23]; this step is rate-determining in dissociation [24], as indicated in Figure 5. Though the barrier for conversion of RPO to I2 is much larger than that for conversion of I2 to I1, the rate constant k-3 for conversion of RPO to I2 does not determine the rate of dissociation because this step is rapidly reversible on the time scale of conversion of I2 to I1. Hence, the equilibrium constant K3 for the interconversion of I2 and RPO, and not the rate constant k-3, appears in the equation for kd in Figure 4.
The observation of single exponential dissociation kinetics means that I2 and any other intermediates like I3 rapidly equilibrate with RPO on the time scale of conversion of I2 to closed complexes [139]. In other words, in dissociation of RPO without a high salt upshift (see below), any I2 formed usually reverts to RPO but occasionally undergoes the rate-determining DNA closing step to form I1,L, which rapidly dissociates. Analysis of Mechanism 3 relates the observed rate constant kd for RPO dissociation to the DNA closing rate constant k-2:

$$k_d = f_{I_2}\,k_{-2} \qquad (8)$$

where fI2 is the fraction of open complexes that are I2:

$$f_{I_2} = \frac{1}{1 + K_3} \qquad (9)$$

To date the intermediate open complex I3 at λPR has been treated as part of the RPO population, as shown in Figure 4, because neither the equilibrium constant nor rate constants for forming I3 from RPO have been determined [24]. Hence, the equilibrium constant K3 in Equation (9) is a composite of those for the conversions of I2 to I3 and I3 to RPO.

Determining the DNA Closing Rate Constant (k-2) and the Stabilization (K3) of the Initial Open Complex
Fast salt-upshift dissociation experiments provide a valuable method of determining the DNA closing rate constant k-2 [24]. For λPR (and presumably other promoters with long-lived open complexes), the stabilization equilibrium constant K3 decreases strongly with increasing salt or urea concentration, so a fast salt or urea upshift rapidly converts the initial population of RPO to I2 [24]. Because the DNA closing step is found to be independent of salt concentration, a transient burst of I2 is obtained, suitable for determining the DNA closing rate constant k-2 [24] and for fast DNA footprinting of I2 [23]. Moreover, once determined at high salt concentration, k-2 is used at lower salt concentrations to dissect kd and determine K3 from Equation (9). For λPR at 37 °C, RPO lifetime (1/kd) is approximately 11 hours, while that of I2 (1/k-2) is about 1 second. Therefore, K3 is about 10⁵, and RPO is approximately 10⁵ fold more stable than I2.
The strong dependences of K3 on urea and KCl concentration [18,24] and the effect on K3 of deleting the β' jaw or the downstream duplex [134] indicate that a major part of this stabilization involves assembly of the jaw and other DMEs on the downstream duplex, schematically illustrated in Figure 3. Movements of the downstream (discriminator) region of the nontemplate strand [23] and of RNAP elements in the cleft in concert with tightening of interactions in the cleft are also implicated in stabilization of the initial open complex.
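The estimate of K3 from the two lifetimes can be checked directly. The sketch below assumes that K3 is defined as the ratio of stabilized open complexes to I2, so that kd = k-2/(1 + K3) and K3 = k-2/kd − 1; the function name `stabilization_constant` is hypothetical, and the lifetimes are the approximate values quoted above.

```python
def stabilization_constant(k_d, k_closing):
    """K3 from kd = k_closing / (1 + K3), with k_closing = k-2.

    k_d: observed dissociation rate constant of the stable open complex (s^-1)
    k_closing: DNA closing rate constant k-2 (s^-1)
    """
    if k_d <= 0.0 or k_closing < k_d:
        raise ValueError("requires 0 < k_d <= k_closing")
    return k_closing / k_d - 1.0


# Approximate lifetimes for lambda PR at 37 C quoted in the text.
lifetime_RPo_s = 11 * 3600.0   # 1/kd, about 11 hours
lifetime_I2_s = 1.0            # 1/k-2, about 1 second

K3 = stabilization_constant(1.0 / lifetime_RPo_s, 1.0 / lifetime_I2_s)
print(f"K3 = {K3:.2e}")
```

With these rounded lifetimes the ratio comes out near 4 × 10⁴, within the order-of-magnitude estimate of about 10⁵ given in the text.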

Fundamental Similarities of the RNAP-Promoter Mechanism to a Mechanism of Enzyme Catalysis; Implications for Regulation of Open Complex Formation and Lifetime
Mechanism 1 is of course formally the same as the minimal two-step mechanism of enzyme catalysis. The hyperbolic dependence of kobs on [RNAP] in Equation (1) is completely analogous to the hyperbolic dependence of the initial velocity of the enzyme catalyzed reaction on substrate concentration for noncooperative enzymes; K1 and k2 are the counterparts of 1/KM and kcat in enzyme kinetic analysis. Another fundamental analogy, at least for λPR, is that the DNA opening-closing step in mid-mechanism is rate-determining in both directions of the mechanism, just as the central catalytic steps are typically rate determining for both directions of an enzyme catalyzed reaction.
Regulation of the kinetics of enzyme catalyzed reactions by ligands or cooperativity of multi-subunit enzymes is primarily at the level of the initial reversible steps of substrate binding and conformational change that precede the central catalytic step, while the catalytic step itself is typically not regulated. Is this also true of the kinetics of open complex formation and the lifetime of the stable open complex? Both the initial binding step and the conformational changes in the I1 ensemble that prepare the DNA to be opened are highly regulated; equilibrium constants for these steps are strong functions of promoter sequence and length [142] and of concentrations of transcription factors, ligands, solutes and salts [24,60,63-66]. Likewise, the steps converting the initial open complex I2 to the stable open complex (RPO at λPR) are strong functions of promoter sequence and solute and salt concentrations, and are, thus, also likely targets of regulation.
On the other hand, evidence to date indicates that while the DNA opening and closing rates are strongly temperature dependent (especially DNA opening), these analogs of the catalytic step are relatively insensitive to solution variables. For example, closing rate constants k-2 are moderately temperature dependent [24] but only weakly dependent on promoter sequence, and are similar for WT and deletion variant RNAP lacking σ1.1 and/or DME regions. Isomerization rate constant k2 for λPR, thought to be a close approximation to kopen (see above), is strongly temperature dependent [60] but is independent of urea concentration and is only weakly salt concentration dependent [24,60,63,64]. We therefore propose that the central DNA opening-closing step is relatively universal and unregulated for Eσ⁷⁰.
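The enzyme analogy can be made concrete by computing the counterparts of KM and kcat/KM from the kinetic parameters quoted in this review for the two λPR variants at 37 °C. This is an illustrative sketch only: the mapping (1/K1 as the half-saturating RNAP concentration, K1·k2 as the apparent second-order rate constant at low [RNAP]) follows from the hyperbolic form of kobs, and the function names are hypothetical.

```python
def half_saturating_conc(K1):
    """RNAP concentration (M) at which kobs = k2 / 2; the analog of KM."""
    return 1.0 / K1


def low_conc_rate_constant(K1, k2):
    """Initial slope of kobs vs [R] (M^-1 s^-1); the analog of kcat / KM."""
    return K1 * k2


# (K1 in M^-1, k2 in s^-1) at 37 C, as quoted in this review.
promoters = {"FL": (5.8e6, 0.66), "UT-47": (2.6e7, 0.02)}

for name, (K1, k2) in promoters.items():
    km_analog_nM = half_saturating_conc(K1) * 1e9
    slope = low_conc_rate_constant(K1, k2)
    print(f"{name}: 1/K1 = {km_analog_nM:.0f} nM, K1*k2 = {slope:.2e} M^-1 s^-1")
```

On these numbers the truncated promoter saturates at a lower RNAP concentration but has the smaller K1·k2, so it forms open complexes more slowly than the full-length promoter at any RNAP concentration.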

Conclusions
In this review, we summarize evidence for the series of large conformational changes, set in motion by binding of bacterial RNAP to promoter DNA to form the initial closed complex and involving bending and wrapping of upstream DNA, which allow the downstream duplex DNA to be bent into the active site cleft of RNAP to form an advanced closed complex. Studies of full-length and upstream truncated promoters indicate that formation of this advanced closed complex is necessary for the subsequent bottleneck step in which the transcription bubble is opened using binding free energy, placing the template strand in the RNAP active site and forming an initial open complex. We also review the evidence that, at some but not all promoters, the initial open complex is stabilized and its lifetime greatly increased by a network of interactions involving the discriminator region of the nontemplate strand and mobile in-cleft and downstream elements of RNAP.
We review the strong formal parallels between the kinetics and mechanism of RNAP-promoter open complex formation and stabilization and the kinetics and mechanism of enzyme catalysis. Both mechanisms divide into three classes of steps with the central bottleneck step (catalysis, DNA opening; the focus of the mechanism) bracketed by reversible binding and conformational steps. In enzyme catalysis, reversible substrate binding and conformational steps are the focus of regulation by inhibitors, activators, and cooperative enzymes, while the central catalytic step is relatively unregulated. Evidence indicates that a similar, previously unrecognized principle applies to regulation of transcription initiation. Most regulation of the rate of open complex formation occurs in the rapidly reversible binding and closed-complex conformational steps that precede the central DNA opening step. Likewise, most regulation of open complex lifetime is in the steps that stabilize the initial open complex. The intrinsic DNA opening-closing step, the analog of the catalytic step, appears relatively universal and unregulated, with rates in both directions of approximately 1 s⁻¹ at 37 °C.