**4. Functional Importance of Lin28-Mediated mRNA and miRNA Regulation for Stem Cell Maintenance, Cancer and Development**

The functional importance of Lin28 in stem cell maintenance and reconstituting pluripotency becomes apparent when looking at the signaling pathways in which Lin28a/b are involved. Both paralogs are highly expressed in mammalian ESCs and are a central part of a conserved pluripotency network. For example, the expression of Lin28a is driven by the proto-oncogenic transcription factors Oct4, Sox2 and Nanog, with Sox2 being most critical for efficient Lin28a expression [55,56]. Once Lin28a is expressed, it antagonizes *let-7* and hence de-represses *let-7* targets such as c-Myc, Sall4, Igf2bps, Hmga2, various cyclins as well as Lin28 itself, thereby ensuring a constant expression of stemness factors and cell-cycle regulators [57]. In addition, Lin28a directly or indirectly stimulates the translation of mRNAs encoding cell-cycle regulators or growth-promoting factors such as Cyclin A/B, Oct4 and Igf2 [18,25,48,50]. Consequently, Lin28a/b up-regulate the expression of cell-cycle regulators and growth-promoting factors via *let-7*-dependent and *let-7*-independent mechanisms, thereby activating and maintaining signaling pathways that are important for self-renewal and proliferation. In agreement with this, Lin28a overexpression is not essential for reprogramming human fibroblasts to iPSCs but strongly accelerates reprogramming by stimulating cell proliferation [58].

The strong effect of Lin28a/b on cell-cycle progression and proliferation [59] and the frequent re-activation of Lin28a/b in multiple cancers [12] supported the role of Lin28 as a potential oncogene. Indeed, Lin28a/b overexpression in NIH/3T3 cells led to tumor formation in nude mice and was linked to depletion of mature *let-7*. As a consequence, oncogenic *let-7* targets such as c-Myc and N-Ras were de-repressed, and, since c-Myc itself transcriptionally activates various oncogenic miRNAs as well as Lin28b, a positive feed-forward loop is established [12,60]. Iliopoulos and colleagues revealed another positive feedback loop between NF-κB, Lin28b, *let-7* and IL-6 (Interleukin 6). Transient activation of Src tyrosine kinase in immortalized breast cells led to activation of NF-κB, which binds to the Lin28b promoter and induces its expression. As a result, Lin28b represses *let-7* processing, and the *let-7* target IL-6 can be produced and activate NF-κB, thereby closing the positive feedback loop [61]. Similar to its role in reprogramming somatic cells to iPSCs, an elevated expression of Lin28a/b might also be important in the formation of cancer stem cells (CSCs) [62]. This subpopulation of tumor cells is thought to be essential for the propagation of some cancers and might arise in a reprogramming-like mechanism [63]. Hence, Lin28a/b reactivation would contribute to the formation of metastases, thereby explaining why Lin28a/b up-regulation correlates with tumor aggressiveness and an advanced tumor stage [12,62].
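
The logic of the NF-κB/Lin28b/*let-7*/IL-6 loop can be illustrated with a toy Boolean network. The sketch below is not taken from the cited study: the synchronous update scheme, the on/off abstraction and all names (`step`, `simulate`, `src_pulse_steps`) are assumptions made purely for illustration. It only shows how a transient Src signal that outlasts one pass through the loop could, in principle, lock the circuit into a self-sustaining state with low *let-7*.

```python
from typing import Dict, List

State = Dict[str, bool]


def step(state: State, src_active: bool) -> State:
    """One synchronous update of the toy loop (illustrative assumption)."""
    return {
        "nfkb": src_active or state["il6"],  # Src or IL-6 activates NF-kB
        "lin28b": state["nfkb"],             # NF-kB induces Lin28b expression
        "let7": not state["lin28b"],         # Lin28b blocks let-7 processing
        "il6": not state["let7"],            # let-7 represses IL-6
    }


def simulate(src_pulse_steps: int, total_steps: int) -> List[State]:
    """Run the network with Src active only for the first src_pulse_steps updates."""
    # Resting state: let-7 is present, everything else is off.
    state: State = {"nfkb": False, "lin28b": False, "let7": True, "il6": False}
    trajectory = [state]
    for t in range(total_steps):
        state = step(state, src_active=t < src_pulse_steps)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    for label, pulse in (("no Src pulse", 0), ("transient Src pulse", 4)):
        final = simulate(src_pulse_steps=pulse, total_steps=10)[-1]
        print(label, final)
```

In this toy model the resting state is stable without Src, whereas a four-step Src pulse leaves NF-κB, Lin28b and IL-6 switched on and *let-7* switched off after Src has returned to baseline. Real kinetics are continuous and far more complex, so the model should be read as a reasoning aid only.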

Given that *let-7* family members target numerous metabolic genes, it is not surprising that Lin28a/b overexpression also has an impact on growth, developmental timing and metabolism. Using genome-wide association studies, genetic variation within the LIN28B locus was linked to changes in human height, timing of puberty and the age of menopause [64–66]. Consistent with these studies, Lin28a overexpression in transgenic mice led to similar phenotypes and was associated with increased insulin sensitivity and increased glucose uptake [14]. On the molecular level, Lin28a/b act on multiple components of the insulin-PI3K-mTOR pathway, thereby explaining why administration of the mTOR inhibitor rapamycin could rescue the Lin28a-mediated metabolic phenotype [15]. Further *in vitro* studies showed that Lin28a de-represses *let-7* targets of the insulin-PI3K-mTOR pathway such as Igf1r, Insr, Irs2, Akt2, Tsc1 and Rictor [15,67]. The authors could not rule out that Lin28a/b themselves associate with these mRNAs and enhance their translation. Recent genome-wide CLIP-seq studies indeed suggested that Lin28a/b bind to mRNAs of insulin and Igf receptors as well as glycolytic and mitochondrial enzymes, thereby modulating their translation directly [19,52,53]. Hence, Lin28a/b seem to regulate both mRNA translation and *let-7* maturation to coordinate proliferative signaling pathways and cellular metabolism in order to maintain the self-renewal potential of stem or progenitor cells. However, given the wealth of recently identified mRNA targets of Lin28a/b, their overlap with known *let-7* targets and the interwoven signaling pathways, it remains to be determined which of the identified targets indeed contribute to the observed physiological functions.

#### **5. Structural Basis for the RNA-Binding Specificity of Lin28**

#### *5.1. The Lin28 Zinc-Knuckle Domain Specifically Recognizes GGAG or GGAG-Like Motifs*

After identifying *let-7* precursors as major targets of Lin28a and Lin28b, several groups aimed to identify the specificity of this interaction. Using electrophoretic mobility shift assays with different pre-*let-7* sequences, it initially became apparent that the terminal loop of pre-*let-7* (also called pre-element or preE) is sufficient for Lin28a binding [33]. An alignment of stem-loop precursors of *let-7* revealed a highly conserved GGAG motif within vertebrates that is critical for Lin28 binding. Mutations within this motif (GGAG→AAAG and GGAG→GUAU) released the Lin28a-mediated block of pri- or pre-*let-7* processing and impaired TUT4-mediated oligo-uridylation of pre-*let-7* [7,22]. Conversely, introduction of the GGAG motif into the preE of an unrelated miRNA (pre-*miR-16-1*) allowed Lin28a binding and TUT4-mediated uridylation of this chimeric pre-miRNA [7].

Due to the close homology between Lin28's ZKD and the ZKD of HIV-1 nucleocapsid protein (HIV NC), which was known to bind GGAG- or GGUG-containing loops within the HIV Ψ-RNA recognition element [68–70], it was suggested that Lin28's ZKD mediates a specific interaction with the conserved GGAG motif (see Figure 5A). Indeed, mutations within Lin28's ZKD specifically impaired pre-*let-7* binding as well as binding of the isolated Lin28 ZKD to GGAG-containing RNAs [7,34,35,71,72]. Co-crystal structures of a minimal mouse Lin28a construct with GGAG-containing oligonucleotides derived from preE-*let-7* (Figures 4 and 5) and an NMR solution structure of human Lin28a's ZKD bound to AGGAGAU provided the final proof for the supposed interaction (Figure 5B,C) [35,72].

No structural data have so far been obtained for Lin28:mRNA binding. However, despite their discrepancies in individual mRNA targets, most of the above-mentioned genome-wide HITS-CLIP and PAR-CLIP studies identified GGAG or GGAG-like consensus motifs within Lin28a/b binding sites. For example, Wilbert *et al.* found a highly enriched GGAGA(U) consensus sequence that was preferentially located within loop structures [54]. Cho *et al.* detected AAGNNG, AAGNG and UGUG motifs that are often located in terminal loops of small RNA hairpins [49]. Finally, Graf *et al.* detected GGSWG (S = G or C, W = A or T) or AAGRWG (R = A or G) motifs in Lin28b binding sites. Using individual domain PAR-CLIP (iDo-PAR-CLIP), Graf and colleagues further confirmed the GGGAG sequence as the top motif within Lin28 ZKD binding sites, whereas Lin28 CSD binding sites were rather U-rich [53]. These data indicate that the GGAG motif is indeed the major determinant of Lin28 RNA binding and is recognized by the ZKD. Even a mutation of the first or second guanosine only moderately impairs the interaction, thereby mirroring the overall flexibility of both ZKD and RNA. A recent study revealed that CCHC Zn knuckles can be used to design single-stranded nucleic-acid binding proteins that specifically recognize a number of guanosines [73]. This study further demonstrated that the length of the inter-knuckle linker affects the spacing between specifically bound guanosines. Hence, Lin28 ZKD probably prefers GNNG motifs over the NGNG motifs recognized by HIV-1 NC ZKD (see Figure 5C). Interestingly, TUT4 and TUT7 also contain CCHC Zn knuckles that are critical for pre-*let-7* oligo-uridylation [42,74]. Compared to Lin28, the distance between these knuckles is larger (37 aa), indicating that they act independently of each other.
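
The degenerate consensus motifs listed above can be made concrete by translating them into regular expressions. The following is a minimal sketch, not the pipeline used in any of the cited CLIP studies: the function names (`motif_to_regex`, `scan`) are hypothetical, the motif list is simply copied from the text, and the W code (given as A or T in sequencing reads) is mapped to A or U because the scan is done on RNA.

```python
import re
from typing import List, Pattern, Tuple

# IUPAC codes used by the motifs in the text, expressed for RNA (T is read as U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U", "T": "U",
    "S": "[GC]", "W": "[AU]", "R": "[AG]", "N": "[ACGU]",
}

# Consensus motifs as reported in the text.
MOTIFS = ["GGAG", "GGSWG", "AAGRWG", "AAGNNG", "AAGNG", "UGUG"]


def motif_to_regex(motif: str) -> Pattern[str]:
    """Translate an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in motif)
    return re.compile("(?=(" + body + "))")


def scan(seq: str) -> List[Tuple[str, int, str]]:
    """Return (motif, 0-based start, matched sequence) for every hit, sorted by start."""
    rna = seq.upper().replace("T", "U")
    hits = []
    for motif in MOTIFS:
        for match in motif_to_regex(motif).finditer(rna):
            hits.append((motif, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for sequence in ("AGGAGAU", "GGGAG", "AAAG", "GUAU"):
        print(sequence, scan(sequence))
```

Applied to the heptamer AGGAGAU used in the NMR structure, the scan reports a single GGAG hit, whereas the mutated tetramers AAAG and GUAU give no hit. A sequence match alone says nothing about accessibility or affinity, which also depend on RNA secondary structure.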

**Figure 4.** Co-crystal structure of a minimal mouse Lin28a construct with preE-*let-7*d derived RNA (PDB ID 3TRZ). The ZKD specifically binds to the conserved GGAG motif, whereas Lin28 CSD establishes extensive interactions with the less conserved terminal hairpin loop.

The second CCHC Zn knuckle undergoes a large structural change upon RNA binding, whereby the central Zn2+ ion moves about 25 Å. Responsible for this large conformational shift is Pro158 within the Pro-rich linker region, since its ψ torsion angle performs a 130° rotation (Figure 5B) [72]. While HIV-1 NC ZKD binds G-2 and G-4 of a GGAG tetraloop in a sequence-specific manner, the two CCHC Zn knuckles of Lin28 recognize the first and fourth guanosine of the GGAG motif, respectively, by sequence-specific hydrogen bonds to the bases. Hydrogen bonding is mediated by backbone carbonyl and amide groups of residues that are located within the rigid parts of the CCHC Zn knuckle. In addition to this sequence-specific interaction, both G-1 and G-4 are sandwiched in a hydrophobic pocket by one conserved Tyr and His in the first Zn knuckle and another conserved His and Met in the second Zn knuckle (Figure 5C). In the case of mLin28a:GGAG structures, G-2 is also bound in a sequence-specific manner via hydrogen bonds from backbone carbonyl groups and the N1 amino group of A-3. Moreover, A-3 contributes to the formation of a strong kink within the RNA backbone, since it also contacts G-1 [35] (Figure 5D). Although such a strong bending of the RNA backbone was not observed in the hLin28a:AGGAGAU structure, the imposed structural changes within RNA and protein likely lead to a constant opening of the neighboring double-stranded pre-*let-7* stem, thereby masking the Dicer cleavage site [34,35].

**Figure 5.** Lin28 ZKD specifically recognizes single-stranded GGAG or GGAG-like sequences. (**A**) Sequence alignment of HIV-1 NC, HIV-2 NC, hLin28a and hLin28b ZKDs. The chelating Cys and His residues of the CCHC Zn knuckles (ZnK) are shaded in red. Conserved residues are labeled from light red (100% type-conserved) to dark red (70% type-conserved); (**B**) Comparison between unbound hLin28a ZKD (green, PDB-ID 2CQF) and AGGAGAU-bound hLin28a ZKD (purple, PDB-ID 2LI8). Upon RNA binding, hLin28a ZKD undergoes a dramatic conformational shift mainly caused by a rotation of the Pro158 ψ angle; (**C**) In comparison to HIV-1 NC, the inter-knuckle linker of hLin28a ZKD harbors an additional Pro. As a consequence, the knuckles are further apart, thereby explaining why HIV-1 NC ZKD specifically binds G-2 and G-4 while hLin28a ZKD binds G-1 and G-4 of the GGAG motif in a hydrophobic pocket; (**D**) Structure of mLin28a ZKD bound to GGAG (derived from PDB-ID 3TS0). mLin28a is represented in green cartoon and the bound GGAG motif in purple (G) and pink (A). Tyr140 of the first and His162 of the second ZnK are key residues for the interaction, since they contact each other and stack with the bases, thereby establishing a kinked conformation in the RNA. All three guanosines are specifically recognized via various hydrogen bonds with backbone amide and carbonyl groups. In addition, G-1 and G-4 are bound in a hydrophobic pocket formed by Tyr140, His148, His162 and Met170.


*5.2. The Lin28 CSD Has Broad Sequence Specificity and Can Induce Local Structural Changes within RNAs*

Despite Lin28's specificity for GGAG-containing RNAs, the isolated Lin28 ZKD is not sufficient for binding *let-7* precursors and blocking their processing [34,35]. So what is the contribution of Lin28's CSD with respect to sequence specificity, binding affinity and inhibition of pre-*let-7* processing?

CSDs are highly conserved RBDs that are widely distributed in bacteria, animals and plants and fulfill pleiotropic functions mainly related to RNA metabolism (reviewed in [75]). Bacterial major cold-shock proteins (Csps) share between 30% and 45% sequence identity to Lin28 CSDs and are known to bind pyrimidine-rich ssDNA/ssRNA oligonucleotides with affinities in the sub-nanomolar to micromolar range [76–81]. In addition to this, they can act as RNA chaperones that destabilize local RNA secondary structures [82–84]. Crystal and NMR structures of Csps have been known since the 1990s [85–89].

A systematic binding analysis with *Xenopus tropicalis* (*Xtr*) Lin28b CSD revealed that this domain has a broad sequence specificity and shows the highest binding affinities for pyrimidine-rich RNA octamers that contain at least one guanosine at the 5' end [34]. The observation was further confirmed by genome-wide PAR-CLIP studies, in which Lin28a/b binding sites were generally uridine-rich and flanked by one or more guanosines [19,53]. Moreover, these binding sites were typically located upstream of the corresponding ZKD binding sites, indicating a defined domain orientation of Lin28's RBDs on RNA targets [53].
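
The qualitative rule described above (a pyrimidine-rich octamer with a 5' guanosine, located upstream of the ZKD site) can be written down as a simple sequence heuristic. The sketch below is an illustration only and not a validated predictor: the function name `candidate_csd_sites`, the fixed 8-nt window, the requirement for G at the first window position and the 0.7 pyrimidine-fraction cutoff are all assumptions chosen for the example.

```python
from typing import List, Tuple

PYRIMIDINES = set("CU")


def candidate_csd_sites(
    seq: str,
    zkd_motif: str = "GGAG",
    window: int = 8,
    min_pyrimidine_fraction: float = 0.7,  # hypothetical cutoff
) -> List[Tuple[int, str, float]]:
    """List 8-mers upstream of the first ZKD motif that start with G and are pyrimidine-rich.

    Returns tuples of (0-based start, octamer, pyrimidine fraction of positions 2-8).
    """
    rna = seq.upper().replace("T", "U")
    zkd_start = rna.find(zkd_motif)
    if zkd_start < 0:
        return []  # no ZKD motif, so no upstream region to inspect
    sites = []
    # Only windows that end before the ZKD motif begins are considered.
    for start in range(0, zkd_start - window + 1):
        kmer = rna[start:start + window]
        if kmer[0] != "G":
            continue
        fraction = sum(base in PYRIMIDINES for base in kmer[1:]) / (window - 1)
        if fraction >= min_pyrimidine_fraction:
            sites.append((start, kmer, fraction))
    return sites


if __name__ == "__main__":
    # Made-up test sequence, not a natural pre-let-7 element.
    print(candidate_csd_sites("GUUUUCUUAAAGGAG"))
```

For the made-up sequence in the usage example, the only candidate is the octamer GUUUUCUU at position 0, which lies upstream of the GGAG motif. Experimentally determined binding sites remain the reference, since the CSD tolerates a broad range of sequences.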

Co-crystal structures of Lin28 CSDs in complex with ssDNA and preE-*let-7* derived RNA stem loops provided valuable information about Lin28's specificity and function in pre-*let-7* and mRNA binding [34,35]. Lin28 CSDs bind to single-stranded nucleic acids via a conserved nucleic acid-binding platform mainly formed of exposed aromatic residues. Unlike for Lin28 ZKD, this binding platform is already pre-formed in the apo protein and, consequently, only subtle changes are observed upon nucleic-acid binding (Figure 6A). Binding of ssDNA and ssRNA is remarkably similar and dominated by π-stacking interactions with exposed aromatic residues (Figure 6B). Consistent with solution binding experiments and bacterial Csp:ssDNA/RNA structures [79,90], Lin28 CSD binds up to 8 nucleotides arranged in a curved single strand with defined orientation. In the case of the mLin28a:preE-*let-7* structures, an additional ninth nucleotide is visible that establishes hydrogen bonds with the first base, thereby closing the preE stem loop. Sequence-specific binding is mainly mediated at position 6, since the presence of a conserved Lys-Asp salt bridge limits the flexibility and, consequently, the size of the binding pocket and contributes to specific hydrogen bonds with the U/T base. In addition to this, at binding subsite 2 either a T, U or G is specifically recognized within a hydrophobic pocket. Despite the difference in size, the corresponding bases are recognized by similar hydrogen bonds. The lack of contacts with the CSD allows the DNA/RNA backbone to adopt slightly different conformations without disturbing hydrogen bonding.

Besides its contribution to binding affinity and specificity, Lin28 CSD can affect and reorganize secondary structures within RNA targets. The first hint in this direction came from a study that examined the effect of Lin28a binding on pre-*let-7g* secondary structure using enzymatic foot-printing [36]. Upon Lin28a binding, some regions of preE as well as a part of the double-stranded stem of pre-*let-7* became more susceptible to cleavage by single-strand-specific ribonucleases. Hence, the authors concluded that Lin28a is able to unwind the double-stranded stem of pre-*let-7*, thereby blocking the Dicer cleavage site. Second, Nam *et al.* provided evidence that Lin28a's CSD can partially melt double-stranded stem loops to generate an optimal binding interface [35]. Third, using site-directed mutagenesis in combination with a kinetic analysis of *Xtr*Lin28b-mediated remodeling of pre-*let-7g*, it was shown that Lin28's CSD first binds to pre-*let-7* and induces a structural change [34]. Consistent with earlier studies on bacterial Csps [83,91], highly conserved His (His68 in *Xtr*Lin28b-binding subsite 4) and Phe residues (Phe77-binding subsite 1,2) were crucial for the remodeling reaction. The CSD-induced remodeling might be important for proper recognition of the GGAG motif by Lin28's ZKD, since in most pre-*let-7* structures the conserved GGAG motif is involved in secondary structures and therefore not accessible for binding (Figure 7). Genome-wide PAR-CLIP and HITS-CLIP studies further supported this hypothesis, since Lin28a/b could recognize RNA binding sites that are predicted to be involved in stable secondary structures [19,49]. Such a chaperone-like function of Lin28 might be an important regulatory mechanism that allows downstream RBPs either to dissociate from or associate with RNPs and influence their processing. Most notably, a recent study provided evidence that Dis3l2 exoribonuclease degrades oligo-uridylated pre-*let-7*. 
This enzyme is composed of one ribonuclease II domain, two CSDs and one CSD-like S1 domain. Interestingly, both the CSDs and the S1 domain were essential for Dis3l2 binding and degradation of oligo-uridylated pre-*let-7*. Given the preference of Lin28's CSD for U-rich binding sites, this suggests that the CSDs of Dis3l2 might recognize the oligo(U) tail. In addition, they may assist in the exoribonucleolytic degradation of oligo(U)-pre-*let-7* by partially unwinding the double-stranded miRNA stem.

**Figure 6.** The Lin28 CSD can bind to a wide range of different RNA sequences. (**A**) Superimposition of unbound (skin color, PDB-ID 3ULJ) and heptathymidine-bound *Xtr*Lin28b CSD (green, PDB-ID 4A76). Both structures are highly conserved and reveal a pre-formed nucleic-acid binding platform with exposed aromatic residues; (**B**) Superimposition of *Xtr*Lin28b:dT7 (green) and mLin28a:preE-*let-7f* (RNA: blue, protein: gray, PDB-ID 3TS0). Both Lin28 CSDs bind single-stranded nucleic acids predominantly via base stacking interactions in a defined orientation. The protein nucleic-acid interaction surface is similar for binding subsites 1 to 7. Binding of an additional eighth (U-8) and ninth (U-9) base in mLin28a:preE-*let-7f* is triggered by the formation of a closed RNA loop; (**C**) Superimposition of bound nucleotides at binding subsite 6 derived from various bacterial and Lin28 CSDs in complex with ssDNA/ssRNA (PDB-IDs 4A76, 4A75, 3TS0, 3TS2, 3PF4, 2HAX). All structures contained T or U nucleotides at this binding pocket. A highly conserved Lys-Asp salt bridge limits the size of the pocket and establishes specific hydrogen bonds with the T/U base; (**D**) Since few interactions are formed with the sugar-phosphate backbone, the bound oligonucleotides can adopt different backbone conformations to optimize binding with Lin28 CSD. For example, at binding subsite 2, the sugar-phosphate backbone of mLin28a:preE-*let-7f* is farther displaced from the protein, thereby enabling binding of G (G-2) instead of T (T-2) without disrupting hydrogen bonds.

**Figure 7.** The pre-elements of *let-7* family members are structurally diverse. In six out of eleven human *let-7* family members, the conserved GGAG motif (blue) is inaccessible for ZKD binding in the lowest-energy folding state. Secondary structure predictions of human *let-7* family members (except miR-98 and miR-202) were calculated and visualized by CLC genomics workbench 3.65. All lowest-energy structures within a ΔΔG range of 1.5 kcal/mol are depicted. For simplicity, only 5 bp of the miRNA stem are shown (labeled in red).

#### **6. Summary and Conclusions**

Recent structural and biochemical studies along with genome-wide CLIP-seq studies have greatly improved our understanding of how Lin28 recognizes target RNAs and fulfills its pleiotropic functions related to regulation of miRNA and mRNA processing as well as mRNA translation. In the case of *let-7* biogenesis, Lin28 ZKD specifically recognizes a conserved GGAG motif within preE-*let-7* and induces a strong bending of the bound RNA backbone. Thus, the adjacent Dicer cleavage site remains constantly unwound and pre-*let-7* can no longer be processed. Apart from a minor preference for pyrimidine-rich sequences with one flanking guanosine, the CSD did not reveal any clear sequence specificity, but was able to remodel local RNA secondary structures. This might be important in three ways. First, initial binding of the CSD to single-stranded RNA sequences can induce a conformation in which the GGAG motif is accessible for subsequent ZKD binding. Second, the CSD might trigger structural changes within target RNAs, thereby stimulating downstream processes such as pre-*let-7* oligo-uridylation. Third, the wide RNA-binding specificity of the CSD enables Lin28 to recognize all *let-7* family precursors in a defined 5'–3' orientation despite the low sequence conservation within *let-7* preEs. Consequently, Lin28 impairs *let-7* biogenesis and irreversibly targets pre-*let-7* for degradation.

On the mRNA level, the combination of both domains enables Lin28a/b to bind thousands of mRNAs. Although there is still no consensus on how Lin28 mRNA binding influences mRNA processing and regulates translation, the observed principles with respect to sequence specificity and RNA remodeling are also valid here. Lin28a/b recognize GGAG or GGAG-like motifs and can access these motifs even if they are embedded in predicted secondary structures. The CSD typically binds to uridine-rich regions upstream of the ZKD binding site and is responsible for Lin28's RNA chaperone-like function. The defined orientation of Lin28 on bound RNA might allow a specific recruitment of downstream factors, such as RNA helicase A. However, downstream effects of Lin28:mRNA binding were quite distinct in recent genome-wide CLIP-seq studies and comprised translational stimulation of growth-promoting and alternative splicing factors, as well as translational repression of ER-destined mRNAs [19,49,52–54]. In addition to the localization of binding sites (coding sequence, 3'UTR), Lin28-induced structural changes within mRNAs and/or direct protein-protein interactions might affect the translation efficiency of target mRNAs.

Therefore, it will be essential to understand how Lin28a/b regulate mRNA processing and translation in more detail. Ongoing studies will need to determine which cellular factors are involved in these processes and which factors regulate the activity of Lin28a/b within different cellular compartments and cells/tissues. Furthermore, it remains to be verified which of the huge number of novel mRNA targets are indeed directly regulated by Lin28a/b and what their impact is on stem cell maintenance/differentiation, development, metabolism and cancer. Last but not least, additional structural and functional data on how Lin28 binds mRNA targets and interacts with downstream components such as RNA helicase A may help to elucidate the mechanisms behind Lin28-mediated translational enhancement. Although much effort has been undertaken to unravel the molecular mechanisms that control the Lin28/*let-7* regulatory axis, there are still a number of issues that remain to be solved. Which precise mechanisms does Lin28 use to inhibit pri-*let-7* processing by the Microprocessor in the nucleus? How does Lin28 stimulate TUT4/TUT7-mediated oligo-uridylation of pre-*let-7*? And is the observed RNA-chaperone-like function of Lin28 mandatory for fulfilling its tasks? Understanding these issues might help us to exploit Lin28's function and manipulate the involved pathways for improved tissue re-engineering and novel treatments of cancer or metabolic diseases.

#### **Acknowledgments**

The authors express their gratitude to Yvette Roske and Anja Schütz (Max-Delbrück-Center, Berlin) for their help with some of the structural studies covered in this review and helpful discussions.

#### **Conflict of Interest**

The authors declare no conflict of interest.

#### **References**




