Molecular Mechanisms of Transcription Initiation at gal Promoters and their Multi-Level Regulation by GalR, CRP and DNA Loop

Studying the regulation of transcription of the gal operon, which encodes the amphibolic pathway of D-galactose metabolism in Escherichia coli, has revealed a plethora of principles that operate in prokaryotic gene regulatory processes. In this chapter, we review some of the more recent findings in gal that continue to reveal unexpected but important mechanistic details. Since the operon is transcribed from two overlapping promoters, P1 and P2, regulated by common regulatory factors, each genetic or biochemical experiment allowed the simultaneous study of two promoters. Recent studies range from genetic and biochemical to biophysical experiments, providing explanations at the physiological, mechanistic and single-molecule levels. The salient observations highlighted here are: the rule governing the choice of transcription start points; the discovery of a new promoter element, different from the known ones, that influences promoter strength; the occurrence of an intrinsic DNA sequence element that overrides the transcription elongation pause created by a DNA-bound protein roadblock; the first observation of a DNA loop and the determination of its trajectory; and the piggybacking of proteins for delivery to their DNA target.


Introduction
The study of the galactose (gal) operon, which encodes enzymes for an amphibolic pathway of D-galactose metabolism, first revealed a plethora of mechanisms by which bacterial genes are regulated: (i) besides substrate induction of a specific catabolic pathway or end-product repression of a specific biosynthetic pathway, accumulation or depletion of metabolic intermediates in the cell globally regulates the expression of a wide variety of genes to compensate for the accumulation or depletion [1]; (ii) use of more than one promoter to regulate an operon [2]; (iii) the mechanism of Rho-mediated premature transcription termination [3]; (iv) gene regulation by a DNA element located within a structural gene [4]; (v) DNA looping to repress gene transcription [4]; (vi) the finding that the global gene activator, CRP, can also repress a gene [2]; (vii) demonstration of the trajectory of DNA loops [5]; and (viii) phage protein-mediated transcription anti-termination in bacterial genes [6]. Here we review more recent findings that reveal several new aspects of the multiple pathways by which gal promoters are regulated at the transcription level.

The Gal Operon
The gal operon is transcribed from two overlapping promoters, P1 and P2, with transcription start points marked as +1 and −5, respectively (Figure 1) [2,7,8]. Why two promoters? Each promoter responds to different regulators to cope with physiological needs, as the enzymes encoded in the gal operon are needed for both catabolic and anabolic metabolism. Both promoters are intrinsically expressed at significant levels. Nonetheless, the complex of cAMP and its receptor protein CRP (CCC) enhances P1 but represses P2, whereas GalR represses P1 and enhances P2 (details below). CCC acts by binding to a single site, AS, and GalR by binding to two operators.
Figure 1. The gal operon. The gal structural genes, galETKM, encode the enzymes epimerase, transferase, kinase and mutarotase, respectively. The transcription start point (tsp) of P1 is +1 and that of P2 is −5. The numbering system is relative to +1, with numbers downstream of +1 as positive (+) and numbers upstream as negative (−). Operators (OE and OI); hbs: HU binding site; AS: activating site. GalR (green) binds to the two operators (OE and OI). HU (blue) binds to the HU binding site (hbs), and the cAMP-CRP complex (yellow) binds to the activating site (AS). The map is not drawn to scale.

Intrinsic Strength of Promoters
Since the tsps for P1 and P2 are separated by 5 bp (half of a helical turn of B-DNA), the two promoters are located on opposite faces of the DNA. The intrinsic strength of each promoter depends on the contribution of several critical base pairs in the promoter region [9]. Neither P1 nor P2 has a functional −35 element; both are composed of extended −10 (ex-10) and −10 sequences, as authenticated by mutational studies [10,11]. In the absence of regulatory proteins, P2 is transcribed 3-fold more efficiently than P1 [12]. The intrinsic strength of a promoter was postulated to depend on the presence of the different promoter elements and on the closeness of their DNA sequences to the consensus forms, so that the frequency of occurrence of a base pair at a given position of an element reflects its relative importance in promoter function [13]. However, the base pair frequency concept was developed without regard to the sequence context. The contribution of a base pair to promoter strength may well depend on the presence of a specific base pair at another, seemingly unrelated, position in the promoter, and this would not be revealed by looking for consensus sequences among heterologous promoters. A more meaningful approach is to assess the contribution of a base pair at a given position while the rest of the promoter sequence is kept constant. To study the effects of individual base pairs on the intrinsic strength of the promoters, each base pair in the overlapping gal promoter region (from −20 to +5) was mutated systematically to the other three base pairs, and the promoter activities were analyzed by an in vitro transcription assay [14]. First, it was observed that purines on the non-template strand at the tsp of P1 and P2 are favorable for the initiation of transcription while pyrimidines are unfavorable, with a preference of A = G >> C = T at the tsp (Figure 2).
The tsp is determined by counting 12 base pairs from the "master base" −11A (see below) located within the −10 element of P1 and P2 [15,16]. Next, base pairs −7T, −11A and −12T were found to be critical determinants of promoter activity. Mutating −7T, −11A or −12T of either promoter to another base inactivated that promoter [14]. In addition, the base pairs in the ex-10 elements (−15T and −14G) of P1 and P2 were also critical for promoter activity, as expected from previous results. In summary, the base pair frequency within known consensus elements correlated well with promoter strength. Surprisingly, however, the strengths of P1 and P2 increased when several native base pairs in the −20 to −16 segment were substituted by others. This segment lies outside the standard ex-10 and −10 elements of both promoters, and the results define a consensus sequence of −20 ATATA/G −16 for the region; no sequence requirement in that segment had been predicted before. How this new sequence element influences the promoters is unknown. The results of the exhaustive mutational analysis of DNA sequence requirements in the gal promoters are summarized in Figure 2 [14].
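The bookkeeping behind such a systematic substitution scan (every base pair from −20 to +5 changed to each of the other three) can be sketched in a few lines of Python. This is only an illustration, not the protocol used in [14]: the function names are ours, the coordinate handling assumes the numbering has no position 0, and the 25-bp sequence is a placeholder rather than the real gal promoter sequence.

```python
BASES = "ACGT"

def promoter_positions(first=-20, last=5):
    """Positions from `first` to `last`, skipping 0 (assumed: no position 0)."""
    return [p for p in range(first, last + 1) if p != 0]

def single_substitutions(sequence, first=-20, last=5):
    """Yield (position, wild_type_base, new_base, mutant_sequence) for every
    single-base substitution of `sequence`, whose first base sits at `first`."""
    positions = promoter_positions(first, last)
    if len(sequence) != len(positions):
        raise ValueError("sequence length must match the coordinate range")
    for index, (position, wt) in enumerate(zip(positions, sequence)):
        for new in BASES:
            if new != wt:
                mutant = sequence[:index] + new + sequence[index + 1:]
                yield position, wt, new, mutant

# Hypothetical 25-bp placeholder, NOT the real gal promoter sequence.
toy = "ACGTACGTACGTACGTACGTACGTA"
mutants = list(single_substitutions(toy))
print(len(mutants))  # 25 positions x 3 alternative bases = 75
```

Each mutant differs from the parent at exactly one position, so any change in promoter activity can be attributed to that base pair in an otherwise constant sequence context.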
The steps of closed and open complex formation at the gal promoters were studied by the indirect abortive initiation method [17]. The mechanism of base pair opening during transcription initiation by RNA polymerase at the galP1 promoter was directly assayed by 2-aminopurine (2-AP) fluorescence [18]. The fluorescence of 2-AP is quenched when it is present in a DNA duplex and enhanced when the 2-AP:T base pair is distorted or deformed. The increase in 2-AP fluorescence was used to monitor base pair distortion at several individual positions in the promoter. Base pair distortions during isomerization were observed at every position tested except −11, where the substitution created a defective promoter. The isomerization appeared to be a multi-step process. Three distinct steps, hitherto unresolved in kinetic terms, were observed in which a significant fluorescence change occurred: a fast step with a half-life of around 1 s, followed by two slower steps with half-lives in the range of minutes at 25 °C. Contrary to commonly held expectations, the 2-AP assays showed that base pairs at different positions opened without any obvious pattern, suggesting that base pair opening is an asynchronous multi-step process. Note that 2-AP was used only at positions where there was an A in the "opening" region of the promoter.
Figure 2. (A) The DNA sequence from −25 to +1 of P1 and a summary of the effect of base pair changes from +1 to −25 on P1 transcription; (B) The DNA sequence from −20 to +6 of P2 and a summary of the effect of base pair changes from −20 to +6 on P2 transcription; (C) Consensus promoter region of P1 and P2 derived from the results shown in (A) and (B). R = A or G, N = any nucleotide. A base pair is in red if it is unique for promoter function, green if it improves promoter function, and black if it is degenerate. The vertical symbol ">>>>" represents a 4.1-fold or greater difference in promoter function from the wild type; ">>>", 3.1- to 4-fold; ">>", 2- to 3-fold; ">", less than 2-fold; "=" indicates equal (reproduced with permission from Elsevier [14]).
The −11A within the −10 box is termed the "master control switch" because DNA melting and strand opening first occur at −11 during isomerization from the closed to the open complex, followed by opening at subsequent positions (−11 to +3) [15,19,20]. A mutation at −11A does not allow base pairs at other positions to open, whereas the reverse is not the case. Crystal structure studies showed that −7T and −11A flip out into hydrophobic pockets in an open complex [21,22]. This explains why any mutation at positions −7 and −11 results in the loss of gal promoter activity [14,15,19].
It also explains why the −7T and −11A bases are highly conserved in the −10 element of promoters and play important roles during the formation of the open complex. It has been proposed that during isomerization, strand opening occurs from −11 to +3 to form a single-stranded DNA bubble, while −12T remains part of the upstream double-stranded DNA bound to RNAP [14].

Role of CCC
The gal operon includes a 16-bp activating site (AS) centered at −40.5 that binds the regulator CCC, which activates P1 and represses P2 (Figure 3A) [8,23-25]. A typical result of CCC action at the gal promoters is shown in Figure 3B. The overlap of the AS at −40.5 with the −35 element of P1 is a feature of CCC-regulated Class II promoters. In contrast, in Class I promoters, the AS is located upstream (−61.5) of the promoter region bound by RNA polymerase (RNAP).
Figure 3. (A) […] between CCC (yellow) at the activating site (−40.5) and RNAP (brown) at the −10 and −35 elements of P1 (+1). P2 is located at −5. The αNTD and αCTD of RNAP contact both subunits of CCC at Class II promoters, as shown in Figure 4 (adapted from [25]); (B) RNAs typically made from the P1 and P2 promoters in the absence (−) and presence (+) of CCC, as analyzed by gel electrophoresis. The concentrations of cAMP and CRP are 100 μM and 50 nM, respectively. RNAI is a control RNA from the plasmid (reproduced with permission from Elsevier [14]).
CCC represses P2 by decreasing open complex formation by RNAP. In contrast, CCC activates P1 by increasing both closed complex formation and isomerization from the closed complex to the open complex of RNAP [17]. The AS is located on the same face of the DNA as RNAP at P1 and on the opposite face from RNAP at P2. By binding to the AS, CCC switches transcription initiation from P2 to P1. The activation of P1 by CCC is also dependent on the superhelical density of the DNA. On a plasmid of 3528 bp, the maximal P1 activation (12-fold) was observed at a superhelical density of −0.051, and the activity decreased at both higher and lower densities [26]. In the absence of CCC, P2 activity is also maximal (2-fold) at a superhelical density of −0.051.
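To give a feel for what a superhelical density of −0.051 means for a 3528-bp plasmid, the short calculation below converts it into a linking-number deficit. It assumes the textbook relation σ = ΔLk/Lk0 with Lk0 = N/h and a relaxed helical repeat h of 10.5 bp per turn; neither the relation nor the value of h is taken from the studies cited here, and the function name is ours.

```python
def linking_deficit(length_bp, sigma, bp_per_turn=10.5):
    """Return (Lk0, delta_Lk) for a closed circular DNA.

    Lk0      = length_bp / bp_per_turn  (relaxed linking number; assumed repeat)
    delta_Lk = sigma * Lk0              (from sigma = delta_Lk / Lk0)
    """
    lk0 = length_bp / bp_per_turn
    return lk0, sigma * lk0

lk0, delta_lk = linking_deficit(3528, -0.051)
print(f"Lk0 = {lk0:.0f}, delta Lk = {delta_lk:.1f}")
```

Under these assumptions the optimum corresponds to a deficit of roughly 17 helical turns relative to the relaxed plasmid.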
CCC induces transcription at P1, a Class II promoter, by making three different activating contacts with different surfaces of the RNA polymerase holoenzyme [34]. One of the contacts is located in the downstream subunit of the CRP dimer at the AS and has been predicted to interact with region 4 of the RNAP σ70 subunit [27,39]. A cluster of negatively charged residues (D53, E54, E55 and E58) in AR3 of CRP interacts with a cluster of positively charged residues (K593, K597, R599 and R596) in σ70 (Figure 4).
RNAP predominantly forms a binary complex at the P2 promoter in the absence of CCC and a ternary complex at the P1 promoter in the presence of CCC. Very high concentrations of heparin are able to dissociate CRP from the P1 ternary complex without changing the properties of the complex. Thus, CCC is not required for the maintenance of the RNAP complex and plays no role in the subsequent steps in P1 transcription as was true for several other promoters [40], suggesting that interaction between CCC and RNAP is needed only transiently for the activation of transcription.

CCC Action on Templates with Single bp Deletions
The role of individual base pairs from −49 to +1 in CCC action was investigated by systematically deleting each base pair and monitoring the effect of CCC on P1 activation and P2 repression (Figure 5A). Deletion of one base pair at positions +1A to −10T (Δ+1A to Δ−10T) does not affect the activation of P1 or the repression of P2 by CCC (Figure 5B). The 1-bp deletion shifted the next adenine from +3 in WT to +2 in the deletion templates (Δ+1A to Δ−10T), allowing P1 to initiate at the new +2A [16]. P2, with its tsp at −5A, was inactivated by single-bp deletions from Δ−5A to Δ−8/9G because no adenine or guanine is available from −4 to −2 to initiate P2. Single-bp deletion of −11A or −12T inactivated both promoters. Interestingly, P1 initiated at the new +2A on Δ−11A, as shown by the faint transcript band in the presence of CCC. In Δ−12T, P1 initiates at the WT +1A. Together, these results suggest that if the distance from −11A to +1 (12 nt) is shortened, RNAP will choose the next downstream purine to initiate transcription.
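The start-point rule suggested by these deletions (count from the master base −11A and initiate at the first purine found 12 to 13 positions downstream) can be written as a small sketch. The function name, the 12-13 counting window expressed as a parameter, and the toy sequences are our assumptions for illustration; the sequences are placeholders and not the gal promoter.

```python
PURINES = {"A", "G"}

def select_tsp(non_template, master_index, window=(12, 13)):
    """Return the 0-based index of the predicted start base, or None.

    non_template: non-template strand, written 5' to 3'
    master_index: 0-based index of the -11A "master base"
    window: inclusive range of positions to test, counting the master base as 1
    """
    for count in range(window[0], window[1] + 1):
        idx = master_index + count - 1
        if idx < len(non_template) and non_template[idx].upper() in PURINES:
            return idx
    return None

# Toy sequences (hypothetical): master A, a pyrimidine filler, then "ATA".
wild_type = "A" + "C" * 10 + "ATA"       # purine at the 12th position
one_bp_deleted = "A" + "C" * 9 + "ATA"   # deletion moves the next A to the 13th

print(select_tsp(wild_type, 0))       # 11
print(select_tsp(one_bp_deleted, 0))  # 12
```

In the toy deletion template the 12th position is a pyrimidine, so the sketch falls through to the purine at the 13th position, mirroring the shift of initiation to the new +2A described above.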
The −12T position is the first base of the −10 element of P1 and the last base of the −10 element of P2. It is therefore not surprising that its deletion inactivated the intrinsic transcription of both P1 and P2, and that CCC failed to activate P1. The −13C to −17T sequence contains the ex-10 of P1 (−15 TG −14) and the −10 element of P2 (−17 TATGCT −12). Sigma region 2.5 of RNAP recognizes the ex-10 motif of promoters [41-44]. Detailed analyses of the ex-10 showed the importance of this element in transcription regulation [43]. Deletions of −16A and −17T/−18T result in approximately 5- and 2-fold activation of P1 by CCC, respectively (Figure 5C). P2 was inactivated from −5A to −19G because its tsp, ex-10 and −10 elements (−20 TGTTATGCT −12) are altered. From −20T to −33C, the regulation of P1 and P2 is restored.
CCC is known to protect the AS region in gal DNA from −50 to −25 in DNase I protection assays [10,31,45-47]. The AS contains a non-consensus (NC) half-site and a consensus (C) half-site separated by a 6-bp spacer [48-51]. The AS extends from −49 to −34. Thirteen mutations, each of which inhibits CCC action, are located in the consensus half-site from −38 to −34, proximal to the promoters [23,52]. When −34A is deleted, there is only marginal activation of P1 and repression of P2 (Figure 5D). There was no noticeable change in P1 or P2 levels in the absence or presence of CCC when a base pair in the consensus half-site was deleted. These results suggest that CCC fails to bind to the AS when a base pair is deleted in the consensus half-site. The basal level of P2 was increased 2-fold in Δ−38T; perhaps a stronger −35 element of P2 is created by Δ−38T. The activation of P1 by CCC was restored with single base pair deletions upstream of the consensus half-site, from Δ−39G to Δ−47/48/49T, although P1 was activated only 4-fold in Δ−39G. These results suggest that deletions in the 6-bp spacer between the consensus and non-consensus half-sites do not affect CCC binding. They also suggest that mutations of the non-consensus half-site do not affect the activation of P1 by CCC. Interestingly, Δ−41A and Δ−46A were the only two deletions upstream of the consensus half-site that altered regulation: P1 was activated 9-fold in Δ−41A and 8-fold in Δ−46A by CCC, but in both templates CCC marginally activated P2. Busby and colleagues showed that the consensus half-site is inactivated by three substitution mutations, p35 (−35 CG to GC), p37 (−37 CG to AT) and p38 (−38 TA to AT) [23,52]. They also showed that the AS, unlike in WT, is not protected by CCC in the p35, p37 and p38 mutants [10,46]. EMSA shows no stable complexes of CCC with a 144-bp DNA fragment containing the p35, p37 or p38 mutation [46].
In summary, (i) the distance between −11 and +1 determines the start point selection of P1 and P2. If a purine is not available at +1, RNAP selects the next downstream purine within 12-13 bp of −11A; (ii) −7T, −11A and −12T are critical bases of the −10 elements of P1 and P2 for promoter function. Any deletion or substitution of these bases prevents intrinsic transcription. CCC restores transcription from P1 in −7T and −12T mutants, but not in −11A mutants; (iii) both base pairs in the ex-10 elements (−15 TG −14) are critical in both P1 and P2 because deleting or substituting either of them inactivates the promoter; (iv) no base pair deletion in the spacer region from −20 to −33 affects the activation of P1 or the repression of P2 by CCC; (v) CCC fails to activate P1 or repress P2 when any base pair in the consensus half-site (−34 to −38) of the AS is deleted; (vi) no base pair deletion in the non-consensus half-site, except −41 and −46A, affects the regulation of the promoters by CCC. The conclusions about the role of base pairs in the promoters drawn from single base pair deletions are mostly the same as those drawn from single base pair substitutions in the gal promoters.
Figure 5. […] and P2 from WT and mutant templates (Δ−1 to Δ−15) (reproduced with permission from John Wiley and Sons); (C) mRNAs made from P1 and P2 from WT and mutant templates (Δ−16 to Δ−33); (D) mRNAs made from P1 and P2 from WT and mutant templates (Δ−34 to Δ−49).

Regulation by GalR-OE Complex
To investigate the role of each operator in contact inhibition of P1 and contact activation of P2, OE or OI was subjected to mutational analysis (Figure 6A). The results show that GalR-OE complex formation is sufficient for the repression of P1 (Figure 6B) [53-56]. When OE was deleted, there was no inhibition of P1 and no activation of P2. When OI was deleted, the GalR-OE complex still repressed P1 and activated P2. Deleting OI reduced the length of the transcripts from P1 and P2 by 16 nt, since OI is located downstream of both promoters. There is no change in the length of the transcripts from P1 and P2 when OE is deleted because the tsps of the promoters are located downstream of OE. The operon contains two GalR binding sites (16-bp operators), OE (external operator, centered at −60.5) and OI (internal operator within galE, centered at +53.5) [2,4,57]. The galE gene starts at an ATG (methionine codon) located at +27. The operators are 113 bp (~11 DNA helical turns) from each other, center to center (Figure 1). GalR binds to each operator as a dimer.
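The 113-bp center-to-center spacing follows from the operator coordinates once the numbering convention is taken into account. The sketch below assumes, as is conventional and consistent with the 113-bp figure, that the numbering jumps from −1 to +1 with no position 0; the function names and the 10.5 bp per turn used for the turn estimate are our assumptions.

```python
def linear_coordinate(position):
    """Map a promoter coordinate onto a continuous axis.

    The numbering is assumed to jump from -1 to +1 (no position 0), so
    positive coordinates are shifted down by one to close the gap.
    """
    if position == 0:
        raise ValueError("there is no position 0 in this numbering")
    return position - 1 if position > 0 else position

def center_to_center(upstream, downstream):
    """Distance in bp between two coordinates given in promoter numbering."""
    return linear_coordinate(downstream) - linear_coordinate(upstream)

distance = center_to_center(-60.5, 53.5)
print(distance)                    # 113.0
print(round(distance / 10.5, 1))   # 10.8 turns at an assumed 10.5 bp/turn
```

A naive subtraction (53.5 minus −60.5) would give 114; the missing position 0 accounts for the difference.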
The binding of GalR to OE represses P1 and activates P2 (Figures 6B and 7A). GalR bound to OE is located on the same DNA face as RNAP bound to P1 (Figure 7A), but on the opposite DNA face from RNAP bound to P2 (Figure 7B). GalR represses P1 by inhibiting the rate-determining step of open complex formation through contacts with RNAP [58]. This mode of repression is termed "contact inhibition" (Figure 7A). While P1 is repressed by the GalR-OE complex, P2 is activated by the same complex through a direct contact between GalR and RNAP, termed "contact activation" (Figures 6B and 7B). GalR-OE enhances open complex formation at P2, presumably in the same way that CCC does at P1 [12,53,59]. At P1, GalR energetically traps RNAP in an intermediate complex [53]. GalR mutants (nc, for negative control) that bind to OE and do not repress P1 but repress P2 have been isolated. These mutations presumably define the contact points on GalR to which RNA polymerase binds while occupying P1, and need to be characterized further [60]. The contact points on RNAP for GalR are unknown.

Roadblock of RNAP by GalR-OI Complex
The OI operator is located at position +53.5, which is in the path of the elongating RNAP complex transcribing from P1 and P2. The question is whether the GalR-OI complex can inhibit or block RNAP elongation [61]. Unexpectedly, transcription from the gal promoters under in vitro conditions overrides the expected physical block created by GalR bound to OI (Figure 8). It has been shown that a stretch of pyrimidine residues (UUCU) in the RNA/DNA hybrid located immediately upstream of OI weakens the hybrid and favors RNA polymerase pausing and backtracking after encountering the roadblock. However, a stretch of purines (GAGAG) in the RNA immediately upstream of the pause sequence in the hybrid acts as an anti-pause element by stabilizing the RNA/DNA duplex and preventing further backtracking. This facilitates forward translocation of RNAP, including overriding of the DNA-bound GalR barrier at OI [61]. Consequently, when the GAGAG sequence is separated from the pyrimidine sequence by a 5-bp DNA insertion, RNAP backtracking from a weak hybrid to a more stable hybrid is favored (Figure 8). The roadblock of RNAP by the GalR-OI complex in the template with the 5-bp insertion was rescued by the transcription elongation factor GreB, but not by GreA. GreB and GreA cleave backtracked RNA in the catalytic center of RNAP to create a new 3'-end of the RNA, which can then be elongated [62-65]. As expected, the roadblock is also rescued by D-galactose, which disassembles the GalR-OI complex, allowing RNAP to continue transcription [61]. The ability of a native DNA sequence to override roadblocks in transcription elongation in the gal operon uncovers a previously unknown way of regulating transcription.

DNA Looping
Although GalR binding to a specific operator (OE or OI) has different regulatory outcomes, simultaneous binding of GalR to both operators (in the presence of HU; see later) represses both P1 and P2 (Figure 6B). It was proposed that GalR dimers bound to the distally located operators interact with each other, forming a loop of the intervening DNA that contains the promoters (Figure 9A). To test this model, a set of bipartite operators was constructed by converting gal operators to lac operators in various combinations, and gal repression was studied in vivo [66]. GalR and LacI are members of the GalR-LacI family, whose members show 60% homology in sequence [67]. Simultaneous repression of both promoters occurred only with homologous operator pairs in the presence of the cognate repressor (Figure 9A-C) [66]. GalR does not recognize lac operators and LacI does not recognize gal operators. These results suggest that the occupation of both operators by heterologous proteins was not sufficient for complete repression of the promoters.

Mechanism of Repression by DNA Looping
Although an interaction between GalR-OE and GalR-OI complexes to generate a DNA loop was predicted from the in vivo results, in vitro experiments could not demonstrate DNA looping in gal DNA in the presence of GalR [68]. In vitro, DNA looping additionally needs the presence of the histone-like protein HU and supercoiled DNA (see below). HU assists the GalR-OE complex and the GalR-OI complex in stabilizing a higher-order complex structure, resulting in a DNA loop (see below).
The HU binding site (hbs) is located at the apex of the loop, where the DNA is bent by HU binding, forming the higher-order structure and repressing both P2 and P1 (Figure 6B) [68]. The higher-order structure is termed the "repressosome" (Figure 9A). P2 is repressed only by the repressosome. Incidentally, the simultaneous repression of P1 and P2 by GalR-HU mediated DNA looping overrides the differential regulation of P1 and P2 mediated by GalR-OE and CCC-AS. Mechanistically, synergistic binding of GalR to the distal sites forms a 113-bp DNA loop, which is a topologically closed domain containing the two promoters [56]. A closed DNA loop of 11 helical turns, which is inflexible to torsional changes, disables the promoters either by resisting the DNA unwinding needed for open complex formation or by impeding the processive DNA contacts made by an RNA polymerase in flux during transcription initiation. Interaction between two proteins bound to different sites on DNA, which allosterically modulates the activity of the intervening segment toward other proteins, may be a common mechanism of regulation in DNA-multiprotein complexes.
As mentioned, the P1 promoter of gal contains only ex-10 and −10 DNA elements and no −35 element. Thus, recognition of P1 does not require specific contacts between RNA polymerase and a −35 element region. To investigate whether specific recognition of a −35 element would affect the regulation of P1 by GalR, variants of P1 in which the −35 element was restored were constructed, and their regulation by DNA looping was studied by in vitro transcription assays [69]. The results showed that the GalR-mediated DNA loop is less efficient in repressing P1 transcription when RNA polymerase binds to the −10 and −35 elements concomitantly. The most likely explanation is that RNA polymerase binding to the −35 element, which is known to bend the DNA, creates a bend at a position that interferes with DNA loop formation.
[…] (Figure 11) [72]. The geometry of the DNA loop (Figure 11C) is antiparallel (Figure 11B) and not parallel (Figure 11D). An AFM study with LacI and lac operators on DNA of the same size (with lac operators replacing the gal operators) also revealed an antiparallel loop [72].

Helical Arrangement of Operators
The centers of the operators OE and OI are separated by 11 DNA helical turns, which places the two operator-bound GalR dimers on the same face of the DNA. This arrangement energetically allows an interaction between the two DNA-bound GalR dimers and, consequently, DNA looping and transcription repression (Figure 12A). To investigate how transcription repression depends on the relative helical positions of OE and OI, their helical arrangement was changed either by deleting 2 to 12 bp within positions −50 to −38 to decrease the number of helical turns or by inserting 1 to 21 bp between positions +32 and +33 to increase it [73]. Figure 12C shows that optimal repression of gal RNA synthesis is achieved at net distances of 103, 113, 123 and 133 bp, corresponding to 10, 11, 12 and 13 full helical turns, respectively. However, when a 5-bp segment was deleted (108-bp distance) or 5- and 15-bp segments were inserted (118- and 128-bp distances, respectively), gal repression was lifted, as judged by the high expression of RNAs even in the presence of GalR, HU and supercoiled DNA. Presumably, the GalR-OE and GalR-OI complexes are now located on opposite faces of the DNA, which makes the GalR-GalR contact needed for DNA looping more difficult (Figure 12B). Moreover, the loop size between OE and OI can be increased up to a total of 19 helical turns while maintaining looping-mediated repression [72]. Above 19 helical turns, the repression was reduced, perhaps because a bound RNAP is able to overcome the DNA torsional stress and form an open complex.
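The phasing argument can be checked with a rough calculation. The sketch below takes the helical repeat implied by the text (113 bp corresponding to 11 turns, about 10.3 bp per turn) as an assumption and calls two operators "in phase" when their separation is within a quarter turn of a whole number of turns; the quarter-turn tolerance and the function names are ours, chosen only for illustration.

```python
BP_PER_TURN = 113 / 11  # assumed: the text equates 113 bp with 11 helical turns

def turns(distance_bp, bp_per_turn=BP_PER_TURN):
    """Number of helical turns spanned by a center-to-center distance."""
    return distance_bp / bp_per_turn

def same_face(distance_bp, bp_per_turn=BP_PER_TURN, tolerance=0.25):
    """True if the separation is within `tolerance` turns of an integer."""
    n = turns(distance_bp, bp_per_turn)
    return abs(n - round(n)) < tolerance

for d in (103, 108, 113, 118, 123, 128, 133):
    label = "in phase" if same_face(d) else "out of phase"
    print(d, round(turns(d), 2), label)
```

With these assumptions the repressing spacings (103, 113, 123 and 133 bp) fall close to whole numbers of turns, whereas the derepressed spacings (108, 118 and 128 bp) fall close to half-integral turns, placing the operators on opposite faces.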

Role of HU in DNA Looping
HU is a small basic protein consisting of a heterodimer of α and β subunits, and binds to a 9-bp stretch of DNA [74,75]. The hupA gene codes for HUα (~9 kDa) and the hupB gene codes for HUβ (~9 kDa) [76,77]. The involvement of HU in DNA looping and in the repression of the gal operon was investigated in vivo by monitoring the activity of β-glucuronidase from gusA fused to the P2 promoter. The hupA and hupB genes were deleted (ΔhupA::cmR and ΔhupB::kmR) to generate hupA+B−, hupA−B+ and hupA−B− strains [78]. In the wild-type strain (hupA+B+), in which both genes are present, the β-glucuronidase activity from P2 was repressed (Figure 13A). In the presence of the hupA gene alone (hupA+B−), P2 is repressed as strongly as in wild-type cells, suggesting that hupA is sufficient to achieve complete repression of P2 by DNA looping. When hupA is inactivated (hupA−B+), the repression of P2 is slightly weaker than in hupA+B+ and hupA+B−. β-Glucuronidase expression from P2 was fully derepressed (constitutive) only in the hupA−B− strain. The P2 activity in hupA−B− is comparable to the P2 activity in the presence of D-galactose, an inducer of the gal operon (Figure 13B) [78].

DNA Supercoiling
DNA looping by GalR and HU occurs only with supercoiled DNA, as observed by in vitro transcription of P2: P2 repression was totally dependent on a supercoiled DNA template [78]. Moreover, DNA looping-mediated repression in vivo requires a supercoiled chromosome [71]. As expected, coumermycin, a DNA gyrase inhibitor, also derepressed the P2 promoter when it was added to cells in the absence of D-galactose (Figure 13C).

Piggybacking HU
GalR mediated DNA looping requires binding of HU to an architecturally critical position on DNA (hbs) to facilitate the GalR-GalR interaction. It has been shown that GalR piggybacks HU to the critical position on the DNA through a specific GalR-HU interaction [79]. The thermodynamic parameters of some of the required interactions, GalR-OE, GalR-GalR, HU-GalR, and HU-GalR-OE, were studied by analytical ultracentrifugation, fluorescence anisotropy, and fluorescence resonance energy transfer [80]. The physiological significance of several of these interactions was confirmed by the finding that a mutant HU, which is unable to help looping in vivo and in vitro, failed to show the HU-GalR interaction. The results helped to construct a pathway of DNA looping ( Figure 14). Structure-based genetic analysis indicated that the two DNA-bound GalR dimers interact directly and form a stacked tetramer in assembling a transient loop [81]. The loop is stabilized by HU leaving GalR and binding to the architecturally critical position on the DNA. The GalR-HU contact is likely transient and absent in the final loop structure. A sequence-independent DNA-binding protein being recruited to an architectural site on DNA through a specific association with a regulatory protein may be a common mode for assembly of complex nucleoprotein structures [80].

DNA Loop Trajectory
In the scheme of DNA looping shown in Figure 14, the alignment of the operators in the DNA loop could be in either parallel (P) or antiparallel (A) mode (Figure 15). The feasibility of these trajectories was tested by in vitro transcription repression assays, first by isolating GalR mutants with altered operator specificity and then by constructing appropriate operator sequences to allow the formation of mutant GalR heterodimers bound to specific hybrid operators in such a way as to give rise to only one of the two putative trajectories, parallel (P) or antiparallel (A) [5]. The A1 loop is formed when the 3'-end of OE faces the 3'-end of OI in a head-to-head (→ ←) orientation. The A2 loop is formed when the 3'-end of OE faces away from the 3'-end of OI in a tail-to-tail (← →) orientation. In the P1 loop, the 3'-end of OE points in the same direction as the 3'-end of OI in a head-to-tail (← ←) configuration, while in the P2 loop, the 3'-end of OE points in the opposite direction relative to the 3'-end of OI in a tail-to-head (→ →) configuration. The results show that OE and OI adopt a mutually antiparallel orientation in an under-twisted DNA loop, consistent with the energetically optimal structural model. In this structure the center of the HU-binding site is located at the apex of the DNA loop (Figure 15).
Figure 15. (A) […] The arrows at OE and OI are shown in the 5' to 3' direction; (B) Two trajectories of DNA loops, "antiparallel" (A) and "parallel" (P); (C) mRNAs made from P1 and P2 in antiparallel or parallel configurations (reproduced with permission from Cold Spring Harbor Laboratory Press [5]).

Single Molecule Evidence of DNA Looping
Single-DNA-molecule experiments were first used to demonstrate looping in the lac operon with a linear DNA [82]. The same principle was employed to demonstrate DNA looping in gal, in which the extra factor HU and supercoiling of the DNA were needed [83]. Single DNA molecules, each containing two operator sequences 113 bp apart, with one end tethered to a magnetic bead and the other to a surface, can be twisted to mimic DNA superhelicity by using small magnets placed above the sample, and the end-to-end distance can then be measured. Under such conditions, DNA loop formation by GalR and HU reduced the bead-to-surface distance by the expected amount. GalR/HU-mediated DNA looping was directly detected and characterized for its kinetics, thermodynamics, and supercoiling dependence. Transitions in DNA length between the unlooped state and the looped state were observed in the presence of GalR and HU (Figure 16A). There was no transition in the absence of either GalR or HU. The optimal superhelical density (σ) for looping was −0.03. Looping was not observed with untwisted (relaxed) DNA, making negative supercoiling an essential element for looping in this system, unlike loop formation in lac (Figure 16B) [82]. These experiments also confirmed that DNA looping in gal occurs with an antiparallel DNA trajectory of the two operators [84].
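The superhelical density quoted above relates the number of turns applied to the tethered DNA to its relaxed linking number: σ = ΔLk / Lk0, with Lk0 ≈ N / h for a DNA of N base pairs and helical repeat h. The sketch below assumes h = 10.5 bp per turn for relaxed B-DNA and takes negative turns to mean under-twisting; the tether length in the usage note is hypothetical, since the construct length is not given here, and the function names are illustrative only.

```python
HELICAL_REPEAT_BP = 10.5  # assumed helical repeat of relaxed B-DNA (bp per turn)


def relaxed_linking_number(n_bp: float,
                           helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Lk0: number of helical turns in a relaxed DNA of n_bp base pairs."""
    return n_bp / helical_repeat


def superhelical_density(turns_applied: float, n_bp: float,
                         helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """sigma = delta Lk / Lk0; negative turns correspond to under-twisting."""
    return turns_applied / relaxed_linking_number(n_bp, helical_repeat)


def turns_for_density(sigma: float, n_bp: float,
                      helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Number of turns to apply to reach a target superhelical density."""
    return sigma * relaxed_linking_number(n_bp, helical_repeat)
```

As a usage example, for a hypothetical 10,500-bp tether, `turns_for_density(-0.03, 10_500)` gives roughly 30 turns of under-twisting.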

GalR-GalR Interface for DNA Looping
How does the GalR-OE complex interact with the GalR-OI complex to bring about tetramerization and DNA looping? This question was addressed by isolation and characterization of single amino acid galR mutants that bind to DNA (P1 repression proficient) but do not form a DNA loop (P2 repression deficient) [85]. A reporter gene fusion, OE P1+ P2−~lacZ, was used to monitor P1 repression, and an OE P1− P2+ OI~gusA fusion was used to screen for P2 repression. Such GalR mutants (defective in GalR-GalR interactions, and thus in DNA looping, but retaining DNA binding to OE), R325H, D258N and E230K, were located on a surface of a model structure of the GalR dimer. This area can act as the interface between two GalR dimers [85]. In vitro studies confirmed that the interface mutants R325H, D258N and E230K repress P1 but do not repress P2.

Induction of Gal Operon by D-Galactose
D-Galactose, an inducer of the gal operon, acts by inactivating GalR. D-galactose binding to GalR results in an allosteric change in the protein, which can then no longer contact RNAP or bind to DNA. This neutralizes any regulatory effect of GalR. D-galactose is a mixture of the α-anomer and the β-anomer [86]. Purified α-anomer and β-anomer were used to investigate whether the α-anomer, the β-anomer, or both can inactivate GalR and relieve P1 transcription from the repressed state of the promoter in vitro. The results showed that both the α-anomer and the β-anomer act as inducers by lifting the repression of P1 transcription without DNA looping (Figure 17). How does D-galactose disrupt the repressosome structure? In the case of DNA looping-mediated repression of P2, the presence of D-galactose first breaks up GalR-GalR tetramers into individual operator-bound dimers, a state in which P1 is still largely repressed and P2 is derepressed [87]. Next, D-galactose helps to dissociate GalR from the GalR-OI complex, and finally GalR dissociates from the GalR-OE complex to disassemble the remaining complex. This also confirms that the GalR-OE complex is more stable than the GalR-OI complex [87].

Conclusions
The gal operon of E. coli plays an important role in cellular metabolism by encoding enzymes that catalyze conversion of D-galactose to energy sources as well as to anabolic substrates. The operon is transcribed from two overlapping promoters, P1 and P2. The importance of individual base pairs at various positions in the −49 to +1 segment of the gal promoters for transcription and its regulation was discerned by substitutions, deletions or insertions of base pairs. First, a 12-bp distance from the master base pair (−11) is a determinant of the transcription start point (+1), which is preferably a purine. Second, two of the standard RNAP recognition elements, ex-10 and −10, are responsible for determining the strength of the two gal promoters. Genetic analysis of the two overlapping promoters also identified that the DNA sequence of the segment −20 to −16, which is outside the boundaries of the previously defined promoter elements, contributes to promoter strength in both P1 and P2.
Regulatory proteins, CCC and GalR, regulate the two promoters coordinately and differentially depending on the cellular conditions. (i) CCC binds to the AS element at position −41.5 and activates P1 and represses P2, both at the step of open complex formation; (ii) GalR acts by binding to two operators, OE and OI. The GalR-OE complex inhibits P1 and stimulates P2 by contacting the αCTD of RNAP, thereby preventing open complex formation at P1 and enhancing open complex formation at P2; (iii) Interestingly, the GalR-OI complex does not create a roadblock to any elongating RNAP from P1 and P2 because of the presence of an anti-pause sequence in the immediate upstream area that occludes the pause; (iv) In the presence of the histone-like protein HU, and with supercoiled DNA as template, interaction of the two operator-bound GalR dimers results in the formation of a DNA loop (repressosome). The trajectory of DNA in the loop is antiparallel, as revealed by biochemical experiments and AFM observations. Genetic analysis of GalR identified the protein interface between GalR-OE and GalR-OI in the looped state. Looping needs the binding of HU to the apex of the looped DNA, which stabilizes the repressosome complex. GalR interacts with HU and piggybacks the latter to its binding site. In the final structure, there is no GalR-HU contact. The helical arrangement between GalR-OE and GalR-OI is important for facilitating DNA looping. When they are located on the same face of the DNA, looping is favorable; when they are not on the same face, energetics prevents DNA looping.
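The same-face requirement can be expressed as a simple phasing calculation: two sites lie on the same face of the helix when their center-to-center spacing is close to a whole number of helical turns. The following is a minimal sketch under stated assumptions. The default helical repeat of 10.5 bp per turn applies to relaxed B-DNA, and the effective repeat in an under-twisted loop may differ, so it is left as a parameter; the function names and the tolerance value are hypothetical.

```python
def phase_offset_turns(spacing_bp: float, helical_repeat: float = 10.5) -> float:
    """Signed offset, in turns, from the nearest whole number of helical turns.

    The result lies between -0.5 and 0.5; values near 0 mean the two sites
    sit on roughly the same face of the DNA helix.
    """
    turns = spacing_bp / helical_repeat
    return turns - round(turns)


def same_face(spacing_bp: float, helical_repeat: float = 10.5,
              tolerance_turns: float = 0.15) -> bool:
    """True if the two sites are within tolerance_turns of being in phase.

    tolerance_turns is an arbitrary illustrative cutoff, not a measured value.
    """
    return abs(phase_offset_turns(spacing_bp, helical_repeat)) <= tolerance_turns
```

With the default repeat, a spacing of 105 bp (ten full turns) counts as in phase, whereas 110.25 bp (ten and a half turns) places the sites on opposite faces.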