Accumulation of Stable Full-Length Circular Group I Intron RNAs during Heat-Shock

Group I introns in nuclear ribosomal RNA of eukaryotic microorganisms are processed by splicing or circularization. The latter results in formation of full-length circular introns without ligation of the exons and has been proposed to be active in intron mobility. We applied qRT-PCR to estimate the copy number of circular intron RNA from the myxomycete Didymium iridis. In exponentially growing amoebae, the circular introns are nuclear and found in 70 copies per cell. During heat-shock, the circular form is up-regulated to more than 500 copies per cell. The intron harbours two ribozymes that have the potential to linearize the circle. To understand the structural features that maintain circle integrity, we performed chemical and enzymatic probing of the splicing ribozyme combined with molecular modeling to arrive at models of the inactive circular form and its active linear counterpart. We show that the two forms have the same overall structure but differ in key parts, including the catalytic core element P7 and the junctions at which reactions take place. These differences explain the relative stability of the circular species, demonstrate how it is prone to react with a target molecule for circle integration, and thus support the notion that the circular form is a biologically significant molecule, possibly with a role in intron mobility.

Intron processing pathways.

The ratios of intron full-length circle (FLC) and intron-embedded homing endonuclease mRNA (HEG mRNA) copy numbers, determined in parallel by qRT-PCR for the experimental conditions described in the paper. In the lower row, the ratios are normalized to exponentially growing cells at 25 °C. The primers for FLC-specific amplification were similar to those described in the main paper. For HEG mRNA-specific amplification, the 5′ primer (C397: 5′-GCGTTCTAGGCCCGATG) was designed to span the exon-exon junction created by splicing of the small 51-nt spliceosomal intron found in the HEG of the Dir.S956-1 group I intron. The 51-nt intron is removed as the last step in a series of processing steps leading to the mature mRNA encoding the homing endonuclease, and the resulting exon-exon junction sequence is not present in the precursor or the FLC [1]. Amplification using C397 and a 3′ primer (C398: 5′-CCT TGC TTG TCT GGA TCC TC) resulted in a PCR product of 126 bp.

Table S2. Summary of results from Pb2+ and in-line probing of nucleotides 200-318 of DiGIR2 FLC and L-IVS. Signal strengths are indicated as "1" (weak), "2" (intermediate), or "3" (strong). "*": no signal; nd: not detected due to proximity to primer. Some successive signals could be due to reverse transcriptase "stuttering". This has only been corrected in the most obvious cases.
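As a quick sanity check on the two HEG mRNA primers quoted above (C397 and C398), the following minimal sketch computes their length, GC content, and a rough melting temperature. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb for short oligonucleotides and is an assumption of this sketch; it is not a method described in the source, and the function names are illustrative only.

```python
# Primer sequences as quoted in the text above (spaces removed).
PRIMERS = {
    "C397": "GCGTTCTAGGCCCGATG",
    "C398": "CCTTGCTTGTCTGGATCCTC",
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in deg C: 2 per A/T plus 4 per G/C.

    Assumes the sequence contains only A, C, G and T.
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def summarize(primers):
    """Length, GC fraction and Wallace Tm for each primer."""
    return {
        name: {
            "length": len(seq),
            "gc_fraction": gc_fraction(seq),
            "wallace_tm": wallace_tm(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(PRIMERS).items():
        print(
            f"{name}: {stats['length']} nt, "
            f"GC {stats['gc_fraction']:.2f}, "
            f"Tm (Wallace) {stats['wallace_tm']} deg C"
        )
```

The estimate ignores salt and primer concentration, so it is only useful for a coarse comparison of the two primers, not for choosing an annealing temperature.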

Description of the Overall Structure of the L-IVS and the FLC
P4-P6 domain. This domain has been shown in some introns to fold early and act as a scaffold in folding of subsequent domains [2]. In DiGIR2, P5 is extended by an asymmetrical internal loop and a helix (P5a). The P5 and P6 helices in L-IVS were confirmed by V1 cleavages and their capping loops were accessible to extensive chemical modification. The central part of the molecule appears to be inaccessible to modification and RNase cleavage. The asymmetrical loop connecting P5 and P5a is readily modified by chemical probes and appears not to contain Watson-Crick base-pairs. Interestingly, two of the strongest V1 cleavage sites overall are found in the single-stranded J5/5a at A106 and C107, indicating stacking of these residues (see Figure S2c). The probing results from FLC were very similar to those of L-IVS, with some exceptions. An increased accessibility of probes to the 5′ strand of the top of the domain in FLC and a decreased accessibility of probes to the 5′ strand of P6 and parts of L6 indicate a slightly different positioning of this domain in FLC compared to L-IVS, which is most likely linked to the absence of P1 in FLC. Overall, the structure of the P4-P6 domain was confirmed and did not contain major structural alterations in the FLC.
P3-P9 domain. This domain harbours the binding site for the guanosine co-factor and ωG in P7. In DiGIR2, the P9 part is very complex, consisting of a 2-bp helix (P9.0) and three hairpins branching from this, some of which have been specifically assigned a role in the circularization pathway [3]. The probing strategy excluded data collection from P9.0″ and P9a″ and the most 3′ part of P9.2 in the L-IVS due to overlap or proximity to the primer used in reverse transcription. Likewise, no data from Pb2+, in-line and RNase probes are available for 16-22 nucleotides spanning the FLC circle junction due to the circular RNA purification method. From analysis of those parts that are amenable to analysis, P9.0, P9a and P9b appear inaccessible to chemical modification and RNase cleavage, indicating that this part of the structure is buried within the ribozyme. P9.1 and P9.2 are supported by V1 cleavages and the bulge in P9.2 is accessible to chemical modification. In contrast, the 9-nt L9.2 and the 10-nt internal loop in P9.1 are only modifiable in a few positions, indicating that they form structured motifs. An 8-nt stretch of L9.1 is cleaved at several positions by V1 and is not modified by chemical probes, indicating an involvement in a long-range base-pairing (P13; see below). P7 and P8 were confirmed by the presence of V1 cleavages, and the joining segments J8/7, J7/3, J7/9.0, and L8 by accessibility to chemical modification. The probing results for FLC were similar to those of L-IVS in P3, P8, and P9. However, several differences were noted in and around P7. The V1 cleavages at C182 and G233 were not observed in FLC. In accordance with this, the FLC was accessible to DMS modification at A179, and kethoxal modification at G233 and G235. In addition, A231 in J8/7 and A178 in J6/7 were modifiable by DMS in FLC, but not in L-IVS. Taken together, the results indicate a structural alteration of the G-binding site in the FLC compared to the L-IVS.
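The differences in and around P7 described above can be tabulated with a short sketch. It encodes only the signals named in the text and assumes that a signal not mentioned for a given form is absent in that form; this assumption, the data layout and the function names are illustrative and are not taken from the probing data themselves.

```python
# Probing signals around P7 as reported in the text above: (probe, residue).
# Assumption: a signal not mentioned for a form is treated as absent there.
L_IVS = {("V1", "C182"), ("V1", "G233")}
FLC = {
    ("DMS", "A179"),
    ("kethoxal", "G233"),
    ("kethoxal", "G235"),
    ("DMS", "A231"),
    ("DMS", "A178"),
}

# V1 cleaves helical or stacked regions; DMS and kethoxal modify
# accessible, unpaired Watson-Crick positions.
STRUCTURED_PROBES = {"V1"}


def residue_number(residue):
    """Numeric position of a residue label such as 'G233'."""
    return int(residue[1:])


def differences(l_ivs, flc):
    """Signals seen in only one form, sorted by residue position."""
    rows = [(res, probe, "L-IVS only") for probe, res in l_ivs - flc]
    rows += [(res, probe, "FLC only") for probe, res in flc - l_ivs]
    return sorted(rows, key=lambda r: (residue_number(r[0]), r[1]))


def opened_in_flc(l_ivs, flc):
    """Residues with a V1 signal in L-IVS and a chemical signal in FLC."""
    structured = {res for probe, res in l_ivs if probe in STRUCTURED_PROBES}
    exposed = {res for probe, res in flc if probe not in STRUCTURED_PROBES}
    return sorted(structured & exposed, key=residue_number)


if __name__ == "__main__":
    for residue, probe, where in differences(L_IVS, FLC):
        print(f"{residue:>5}  {probe:<9} {where}")
    print("V1 in L-IVS, chemically modified in FLC:", opened_in_flc(L_IVS, FLC))
```

Under these assumptions, G233 is the one residue that switches from a V1 signal in the L-IVS to chemical accessibility in the FLC, which is in line with the altered G-binding site proposed in the text.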
P1-P2 domain. This is frequently referred to as the substrate domain because it harbours the 5′-splice site. In DiGIR2, the P1-P2 domain consists of an 8 base-pair P1 including 6 base-pairs between the internal guide sequence and the 5′-exon, as well as P2 and P2.1 helices. In the DiGIR2 L-IVS, the 5′-exon is removed and the guanosine co-factor is attached to the 5′-end of the intron. L2 is artificial in the sense that DiGIR1 and HEG have been removed by deletion. However, P2 is roughly similar to P2 known from other group IE introns not carrying an insertion. The remainder of P1 is readily accessible to chemical modification, as expected. P2 and P2.1 are confirmed by numerous V1 cleavages and L2 is modifiable with the chemical probes. The major part of L2.1 appears protected against chemical modification, consistent with its involvement in a long-range base-pairing interaction with L9.1. In all, the P1-P2 domain appears to be the domain most accessible to the probes applied in this study. Obviously, the FLC differs from the L-IVS in that the exoG is missing and the 5′- and 3′-nucleotides of the intron are covalently linked. The nucleotides on the 5′-side of the junction are accessible to chemical modification and thus appear to be solvent exposed. In contrast, nucleotides 3′ to the junction appear inaccessible. The modification pattern of the circle junction region of the FLC is slightly different from that found within the 5′ end of the L-IVS, indicating structural differences. The P2-P2.1 part of the domain is mostly similar in the two molecular species.
P13 long-range interaction. This group IE characteristic long-range base-pairing interaction was confirmed by numerous V1 cleavages on the L9.1 strand that consequently is proposed to constitute the surface exposed part of the helix. P13 is found in both the L-IVS and the FLC.

3D Modeling of the Intron (Figure S4)
The model of the core of the L-IVS was built by homology modelling with the Azoarcus group I ribozyme crystal structure [4,5]. Using this strategy, all nucleotides equivalent to specific positions in the Azoarcus ribozyme from the P3-P9 and the P4-P6 domains were built in accordance with secondary and tertiary interactions that form upon domain assembly. In this context, the precise positioning of the P9 hairpin allowed for the identification of the receptor of the L9 GCGA tetraloop. This receptor is located in the P5 region proximal to the J4/5 internal loop and corresponds to the consecutive base-pairs C97=G129 and U98-A128, which interact with the sugar edge of A250 and G249 from the tetraloop, respectively. These interactions are in agreement with the covariation usually observed within this tertiary structure motif [6] and are moreover supported by sequence covariation in a set of representatives of the group IE ribozymes (Figure S4). Homology modelling was also applied to build the characteristic G•U base-pair of the P1 substrate helix onto the core. The P1 hairpin was further built in the form appearing in L-IVS before the first step of the circularization pathway. The differences between DiGIR2 and the Azoarcus ribozyme cores were addressed by taking advantage of the more closely related Tetrahymena thermophila ribozyme crystal structure deprived of P1, P2 and P2.1 [7]. In DiGIR2, P3 is a 7-bp stem with a bulge between the fourth and fifth base-pairs. J8/7 and J3/4 are 6-nt and 3-nt long, respectively. This situation is specific to DiGIR2 and is closer to introns that contain a P2/P2.1 extension like the Tetrahymena IC1 ribozyme. The group IC3 introns, like the Azoarcus ribozyme, contain a simple P2 hairpin. In these, P3 is 6-bp long, and J8/7 and J3/4 are 6-nt and 4-nt long, respectively.
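The element lengths quoted above can be set side by side in a small tally. Summing P3 base pairs and J3/4 nucleotides is only crude bookkeeping, an assumption of this sketch rather than a structural measurement, and the dictionary layout and function names are illustrative.

```python
# Element lengths quoted in the text above
# (P3 in base pairs, joining segments in nucleotides).
ELEMENTS = {
    "DiGIR2 (group IE)": {"P3_bp": 7, "J8/7_nt": 6, "J3/4_nt": 3},
    "Azoarcus (group IC3)": {"P3_bp": 6, "J8/7_nt": 6, "J3/4_nt": 4},
}


def p3_plus_j34(lengths):
    """Crude tally: one P3 base pair counted against one J3/4 nucleotide."""
    return lengths["P3_bp"] + lengths["J3/4_nt"]


def compare(elements):
    """Combined P3 + J3/4 tally for each ribozyme."""
    return {name: p3_plus_j34(lengths) for name, lengths in elements.items()}


if __name__ == "__main__":
    for name, total in compare(ELEMENTS).items():
        print(f"{name}: P3 + J3/4 = {total}")
```

Both cores give the same combined count in this tally, while J8/7 has the same length in the two ribozymes.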
In accordance with the Tetrahymena ribozyme crystal structure, a triple interaction was built in the DiGIR2 model by docking the Watson-Crick edge of U224 in the narrow groove of the second base-pair of P3, A77-U194. Concerning J3/4, the 5′ residue of J3/4 in the Azoarcus ribozyme falls in place with U83 from DiGIR2, which is involved in the terminal base-pair of P3 with A189. Thus, the additional base-pair of P3 in group IE introns compensates for a shorter J3/4 as compared to group IC3 ribozymes.

Figure S3. Sequence covariation between the P5 receptor and L9 sequence. Intron nomenclature is according to [8].

After building the L-IVS core, the regions non-homologous to the Azoarcus ribozyme, namely the P2-P2.1 and the P9.1-P9.2 extensions characteristic of group IE introns [9], were modelled step by step in order to form the P13 pseudoknot (unpublished data). The model of the full intron of group IC1 from Tetrahymena [10], which displays a P13 element also formed by the interaction between the loops of P2.1 and P9.1, was used as a starting point to elucidate the conformation of these appendages in DiGIR2 (Figure S5a,b). Accordingly, the P2 and P2.1 elements were stacked head to head to form a rod roughly orthogonal to the P1 helix. The P2.1 helix was oriented towards the P3-P9 side whereas P2 was directed towards the P4-P6 domain. On the opposite side of the ribozyme, the P9 domain consists of a four-way junction (4WJ) composed of P9a, P9b, P9.1 and P9.2 that extends the 2-bp P9.0 element stacked onto P7, which allows the ribozyme 3′-residue ωG to be accommodated in the G-binding pocket. In order to allow L9b to interact with P5, P9a had to be stacked with P9b. Furthermore, P9.2 was stacked with P9.1 to allow the latter to lie along P7/P3 and interact with L2.1 to finally form the P13 pseudoknot.
The conformation of the elements encompassing P2/P2.1 and the P9 insertion is moreover supported by the observation that V1 cleavages are only observed on the P13 strand belonging to P9.1 which is indeed exposed to the solvent in the model whereas the opposite strand is buried.
The P5a appendage in DiGIR2 is much shorter than in the Tetrahymena ribozyme [7] and about the same size as in the Twort ribozyme [11]. In the crystal structure of the former, the P4-P6 domain is entirely visible in the density map and adopts the same overall fold as the independent domain [12] with the tetraloop L5b interlocked with its receptor located at the junction between P6a and P6b. In the latter, the P5a appendage is not visible in the crystal structure beyond the P5 receptor of the L9 loop. Chemical and enzymatic modifications in DiGIR2 show that the P5a extension folds as a hairpin with residues from the L5a loop and from the internal loop connecting to P5 extensively accessible to Watson-Crick probes. Thus, P5a appears to point into the solvent in an undefined direction rather than folding back to make specific contacts with other regions of the P4-P6 domain. This conclusion is supported by sequences of ribozymes phylogenetically related to DiGIR2 showing that P5 is often extended by Watson-Crick pairs or by motifs unable to kink the helix, such as asymmetrical bulges or C-loops [13].
The present study represents the first whole atom modelling of a group IE intron ribozyme. Previously, secondary structures and cylinder models of three subgroups of group IE introns have been proposed [9]. In addition, the group IE intron from Candida has been modelled based on Fe(II)-EDTA and T1 cleavage patterns [14]. Our experimental data and modelling generally conform to the previously proposed models [9,15]. In particular, the characteristic peripheral elements P2.1, P9.1 and P9.2 and the long-range interaction P13 were all incorporated into the model. In addition, our model contributes important new structural features, each specific to the splicing or full-length circularization pathway, following the observation of two distinct chemical and enzymatic probing patterns.

Figure S4. The FLC and L-IVS DiGIR2 ribozyme 3D model overlay. Ribbon nucleotides of the two models were provided by structural homology to the Azoarcus group I ribozyme. The parts structurally homologous to the Azoarcus group I ribozyme are depicted in orange (P3, P4, P5, P7, P9b). Nucleotides from the P2/P2.1 and P9.1/P9.2 appendages are depicted in green and cyan, respectively, while the nucleotide extensions of P5a, P6, and P8 specific to DiGIR2 are light gray. The remains of P1 present in the L-IVS are depicted in yellow, while the circle junction of the FLC is shown in purple. This overlay shows how the loss of the 5′ exon promotes opening of P1 due to the lack of stabilization. Further, the formation of the FLC circle junction leads to a rearrangement of the nucleotides in the region. The interaction of loop L9.2 with the J9.0/9a junction promotes and stabilizes the circle junction. The internal loop in P9.1 stabilizes P7 in both L-IVS and FLC molecules.