#### *3.5. Embedded Alu Sequences Can Take Part in Alternative Splicing and A to I Editing in Human mRNAs*

As opposed to other embedded Alu sequences described in this manuscript, the association of Alus with alternative splicing and RNA editing represents nuclear processes in which Alu sequences participate. Alu sequences are found in exons in about 5% of alternatively spliced mRNAs [47]. The presence of an Alu in exons of pre-messenger RNA transcripts can provide alternative splice sites, and parts of the embedded Alu sequences can be incorporated into the processed mRNA [48–51].

Approximately 45 percent of Alu sequences are found in introns in both 5' to 3' and reverse orientations and are present in multiple copies [47,52]. These Alu sequences can potentially form double-stranded stems within a transcript when two Alu RNAs are in antiparallel orientation [53]. This enables RNA editing to take place [54] and can lead to premature stop codons or changes in codon reading [47]. Alus are prominent targets for RNA editing [54]. This is an example of how Alu RNA secondary structure can participate in altering molecular processes.
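As a rough illustration of how A to I editing can change codon reading, the sketch below relies on the fact that inosine is read as guanosine during translation. The sequence, the edited positions, the helper names and the deliberately tiny codon table are all hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
# Minimal codon table covering only the codons used in this toy example.
CODON_TABLE = {"CAG": "Gln", "CGG": "Arg", "AUA": "Ile", "AUG": "Met"}


def edit_a_to_i(rna, positions):
    """Return the RNA with the adenosines at the given 0-based positions changed to inosine (I)."""
    bases = list(rna)
    for pos in positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is {bases[pos]}, not A")
        bases[pos] = "I"
    return "".join(bases)


def read_inosine_as_g(rna):
    """Inosine base-pairs like guanosine, so the ribosome reads I as G."""
    return rna.replace("I", "G")


def translate(rna):
    """Translate complete codons using the toy table above."""
    decoded = read_inosine_as_g(rna)
    return [CODON_TABLE[decoded[i:i + 3]] for i in range(0, len(decoded) - 2, 3)]


original = "CAGAUA"                      # hypothetical two-codon fragment
edited = edit_a_to_i(original, [1, 5])   # hypothetical editing sites

print(original, translate(original))
print(edited, translate(edited))
```

In this toy case both codons change meaning after editing, which is the kind of recoding the paragraph above refers to.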

#### *3.6. Human Endogenous Retrovirus (HERV) LTR Transcripts*

About 8% of the human genome consists of human endogenous retroviruses (HERVs) [55]. HERVs cannot produce a viable virus due to mutations, but their associated LTR transposons serve a vital role in cell transcription. In addition, the human genome contains several thousand copies of single long terminal repeats (sLTRs), which originally stem from HERVs [55]. These sLTRs carry no viral genes but can function as promoters and enhancers when found upstream of genes. However, some sLTRs are situated in introns and are transcribed into RNA. Xu and co-workers, while studying the expression of the HERV-9 U3 sLTR, showed that sLTR RNA transcripts include both sense and antisense RNAs, but only the U3 sLTR antisense transcript can bind key transcription factors involved in cell proliferation; the sense sLTR RNA does not bind transcription factors. Importantly, malignant cells express lower levels of antisense sLTR RNA relative to sense transcripts than normal cells do. The antisense sLTR RNA, which is ~550 nt, appears to be a novel sLTR RNA species. The authors propose that the antisense sLTR lncRNA serves as a trap for some cell proliferation transcription factors. This may be significant in terms of a possible lack of growth inhibition in cancer cells. Thus, this is an example of a regulatory lncRNA that is encoded by a transposon (sLTR) but binds to and inactivates proteins rather than other RNAs. Importantly, this ncRNA may play a crucial role in cell proliferation [55].

#### *3.7. Interrelatedness between HERV LTRs and Intergenic Long Non-Coding RNAs*

In a different study concerning HERV LTRs, other TEs and lncRNAs, Kelley and Rinn provide a comprehensive analysis of human TE sequences in long intergenic non-coding RNAs (lincRNAs) and, conversely, the presence of lincRNA sequences in transposons [56]. About 7700 lincRNAs overlap TEs and about 1530 lincRNAs are devoid of TEs; thus about 80 percent of human lincRNAs are associated with TEs. lincRNAs display a strikingly non-random association with transposable elements; the majority overlap human endogenous retrovirus (HERV) LTRs and a small minority are associated with LINE or SINE elements.
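The "about 80 percent" figure follows directly from the two approximate counts quoted above. A minimal arithmetic check, using only those two numbers (the variable names are ours), could look like this:

```python
# Approximate counts quoted in the text above.
lincrnas_with_te = 7700
lincrnas_without_te = 1530

total = lincrnas_with_te + lincrnas_without_te
fraction_with_te = lincrnas_with_te / total

print(f"{total} lincRNAs in total, {fraction_with_te:.1%} overlap TEs")
```

With these rounded inputs the fraction comes out slightly above 80 percent, which is consistent with the approximate value given in the text.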

Interesting observations were made on the orientation of HERV transposons relative to lincRNAs and expression in specific cells. A large number of HERV LTRs are situated at the transcriptional start sites (TSS) of lincRNAs and in the sense orientation. This suggests that HERV LTRs provide regulatory signals for lincRNAs. lincRNAs display a marked stem cell specificity in expression, but lincRNAs that have no LTR associations are expressed highest in testes. On the other hand, lincRNAs that contain Alu sequences are expressed in all cell lines but testes. lincRNAs most likely function in specific tissues, but Alu-containing lincRNAs may be deleterious in testes, as they are not expressed in these tissues. Thus, there is a tissue-specificity in expression [56]. TEs may work "hand-in-hand" with lincRNAs as functional units in particular cells.

#### *3.8. Regulatory Non-Coding Circular RNAs*

Circular RNAs were first characterized in human and other mammalian cells about 20 years ago [57,58]; however, they were initially detected in electron micrographs over 30 years ago [59]. These RNAs consist of scrambled protein-coding exons, *i.e.*, the order of exons is not the same as in the genomic sequence of protein-coding regions. Scrambled exon sequences were discovered in RNA transcripts in rodents and humans [60]. Subsequently, additional cirRNAs were found [61–63]. Recently, by using deep RNA sequencing and a bioinformatics approach, Salzman *et al.* [64] discovered several hundred circular RNAs in human cells. Surprisingly, Jeck *et al.* [65], using circular RNA enrichment techniques as well as bioinformatics, determined that greater than 14% of human fibroblast gene transcripts are cirRNAs (over 25,000 circular transcripts).

Although circular RNAs arise from protein-coding regions, they do not encode proteins. They are thus a separate class of long non-coding RNAs. Functions of cirRNAs were not elucidated for over 20 years after their discovery. However, the field has now moved dramatically, with two laboratories determining that some circular RNAs serve as "sponges" that carry approximately 70 microRNA binding sites and can thus bind and inactivate the microRNAs [6,7]. This shows that circular RNAs have regulatory functions, *i.e.*, they "regulate the regulator", the microRNAs.

Via bioinformatics analyses, it was shown that Alu elements are found in the upstream and downstream introns that flank the circularized exons, and that these Alu sequences tend to be inverted and thus complementary [65]. Intron pairing, through complementary base-pairing between Alu elements in the upstream and downstream introns, may contribute to and be essential for circularization of exons. If this is so, then Alu elements play a major role in the formation of circularized RNAs. Related to this, there is precedent for Alu pairing in introns during alternative splicing [53].
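The idea that inverted repeats in flanking introns are complementary, whereas repeats in the same orientation are not, can be sketched with a simple reverse-complement check. The sequence below is a short toy string, not a real Alu element, and the function names are hypothetical; the comparison is gap-free and ignores G-U wobble pairs, so it is only a crude proxy for real RNA duplex formation.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def antiparallel_pairing_fraction(upstream, downstream):
    """Fraction of positions that form Watson-Crick pairs when two
    equal-length segments are aligned antiparallel, without gaps."""
    if len(upstream) != len(downstream):
        raise ValueError("segments must have equal length")
    partner = reverse_complement(downstream)
    matches = sum(a == b for a, b in zip(upstream, partner))
    return matches / len(upstream)


# Toy repeat placed in the upstream intron (hypothetical sequence).
upstream_repeat = "GGCCGGGCGCGGUGGCUCA"

# The same repeat in the downstream intron, in inverted or direct orientation.
inverted_copy = reverse_complement(upstream_repeat)
direct_copy = upstream_repeat

print("inverted:", antiparallel_pairing_fraction(upstream_repeat, inverted_copy))
print("direct:  ", antiparallel_pairing_fraction(upstream_repeat, direct_copy))
```

Only the inverted copy pairs along its full length with the upstream repeat, which is the arrangement that would bring the ends of the intervening exons together.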

#### *3.9. piRNAs—Known Regulators of TEs*

piRNAs are a class of small non-coding RNAs that are 26–31 nt. They interact with Piwi proteins, hence their name. The Piwi family consists of regulatory proteins that were originally defined in *Drosophila* as P-element induced wimpy testis [66]. piRNAs are abundantly found in germ line cells, especially in mammals, e.g., several million piRNAs are found in mammalian testes. Genetic regions that encode piRNAs consist of clusters. These clusters have repeats of piRNA sequences and there can be as many as 1000 copies of piRNAs in a cluster. piRNAs are processed from long precursor transcripts, but little is known of the biogenesis of piRNAs and the number and functions of the associated proteins.

Some piRNA clusters consist of transposons or remnants of transposon sequences. Thus piRNAs can have sequences complementary to transposon sequences and can recognize their targets by either perfect or imperfect base-pairing. A major role of the piRNA/Piwi protein complex in germ line cells is to protect cells from invading transposons. This is a type of "genetic immune system" that is found in both eukaryotes and prokaryotes. For example, the CRISPR complex in bacteria and archaea functions by a comparable mechanism (albeit with significant variations on a theme) to protect cells from invasion by plasmids and viruses [67–69]. Both the piRNA/Piwi and CRISPR immune complexes function by an RNA-based mechanism.

piRNA functions have been studied in detail in *Drosophila*, *C. elegans* and mammalian cells. Functions are complex and may differ between species, but a large fraction of piRNAs represent antisense transcripts to transposon transcripts [70–72]. A basic mechanism of action of piRNAs, first deduced in *Drosophila*, follows this scenario: when the cell has previously been exposed to a TE but now experiences an overload of this transposon, piRNAs containing sequences complementary to the TE base-pair with the TE RNA and induce its degradation via the Piwi proteins. When the cell encounters a transposon that it has not been exposed to before, the TE may, by chance, incorporate into the DNA in a piRNA-encoding cluster, and thus its sequence can become part of the piRNA cluster (however, we are not aware that the probability of incorporation has been experimentally determined). Via the same mechanism, piRNA transcripts that are antisense to the new transposon RNA induce degradation of the TE RNA via the piRNA–Piwi complex [71]. Thus, this is an immune system that helps keep transposons in check [70]. It is of interest that cell survival depends, in part, on the probability of incorporation of the TE into a piRNA cluster *vs.* the probability of insertion into and inactivation of an essential gene. However, other protective mechanisms also operate to limit TE activity.
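The scenario described above, in which a cluster acts as a sequence "memory" and a new TE is recognized only after it has inserted into the cluster, can be caricatured in a few lines. This is a toy model under strong assumptions: the sequences are invented, the piRNA length is shortened to 8 nt (real piRNAs are 26–31 nt), only perfect base-pairing is considered, and all names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def antisense_pirnas(cluster_inserts, length=8):
    """Toy piRNA pool: every window of the given length taken from the
    antisense strand of each TE fragment stored in the cluster."""
    pirnas = set()
    for insert in cluster_inserts:
        antisense = reverse_complement(insert)
        for i in range(len(antisense) - length + 1):
            pirnas.add(antisense[i:i + length])
    return pirnas


def is_targeted(te_rna, pirnas):
    """True if any piRNA is perfectly complementary to a stretch of the TE transcript."""
    return any(reverse_complement(pirna) in te_rna for pirna in pirnas)


known_te = "AUGGCUAGCUUAGGCAUCGA"   # hypothetical TE already present in the cluster
new_te = "CCGUAUUAGCGGAUACCGUU"     # hypothetical TE not seen before

cluster = [known_te]
print(is_targeted(known_te, antisense_pirnas(cluster)))  # known TE is recognized
print(is_targeted(new_te, antisense_pirnas(cluster)))    # new TE escapes

cluster.append(new_te)  # chance insertion of the new TE into the cluster
print(is_targeted(new_te, antisense_pirnas(cluster)))    # new TE is now recognized
```

The model leaves out the probability of insertion and the competing risk of disrupting an essential gene, both of which the text notes as open or important factors.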

Additional piRNA processes have been identified. In nematodes, piRNAs detect a TE sequence via imperfect base-pairing and then induce another small RNA class, termed 22G-RNAs, to silence the transposon [73]. Some processes involve epigenetic mechanisms. For example, in *Drosophila*, nuclear piRNAs can target a transposon and direct Piwi proteins to repress chromatin and thus transcription of the TE [74]. Additionally, piRNAs may induce the methylation of TE LINE-1 DNA in humans. This can prevent transcription of the transposon and thus ensure that the TE DNA remains dormant and is not expressed [75].

The piRNA/Piwi complex is also essential in genetic imprinting, in a case involving DNA methylation of the imprinted Ras protein-specific guanine nucleotide-releasing factor 1 (Rasgrf1) locus in mouse germ line cells [76]. Mutations affecting piRNA expression correlate with defects in DNA methylation of Rasgrf1. The differentially methylated region (DMR) associated with Rasgrf1 contains a LINE1 retrotransposon and sequences corresponding to 23–31 nt small RNAs; these correspond to piRNAs. A different locus on chromosome 7 also has a region that produces piRNAs that are a good match to the Rasgrf1 DMR piRNA sequences. The authors propose that piRNAs generated from chromosome 7 target the retrotransposon in the DMR of the imprinted Rasgrf1 locus and that chromosome 7 piRNAs may direct methylation of the Rasgrf1 locus [76]. Thus piRNAs, in addition to being post-transcriptional regulators, may also be involved in epigenetic regulation of chromosomal genes and genetic imprinting.

#### *3.10. SINE/Alu Transcripts Function as ncRNAs in Gene Regulation at the Transcription Level*

The non-autonomous retrotransposon SINE sequences are transcribed into small ncRNAs, but like some piRNAs, they can function at the level of transcription. A SINE transcript termed B2, which is found in the nucleus of mouse cells, was shown to be 177 nt. This B2 RNA transcript is conserved in rodents. The RNA binds polymerase II during the heat shock response, disrupts the polymerase/promoter interaction and represses transcription from protein gene promoters [77–79]. It is unclear whether the ncRNAs also interact with the promoter sequence. The human counterpart, an Alu RNA, functions in a similar manner during heat shock, even though its nucleotide sequence and secondary structure differ from those of B2 RNA [47,77]. Expression of both these small ncRNAs is increased during the heat shock response.

In other studies, a processed human Alu RNA was found to have a sequence identical to that of a piRNA present in mammalian testes [80]. The processed transcript, termed piAluRNA, is found in the nucleus of human adult stem cells, appears to interact with several nuclear proteins and may be involved in several processes. These include transcription, chromatin organization, organelle organization, DNA repair and cell cycle control [80]. An RNA affinity assay with synthetic oligonucleotides representing a segment of the piAluRNA, together with high-resolution mass spectrometry (LC-MS), was used to identify interacting proteins. Functional studies are needed, but the current binding data strongly suggest the involvement of piAluRNA in several of these nuclear functions. These studies may greatly extend the roles of small ncRNAs in cells.

It is important to point out that binding and repression of proteins by ncRNAs also occurs in prokaryotes. For example, the bacterial ncRNA 6S RNA regulates RNA polymerase by binding the sigma 70 factor and subsequently repressing RNA polymerase activity from sigma 70 promoters [81–83]. Thus, this is another example of the basic principles of molecular regulation that encompass all biological kingdoms.

#### **4. Conclusions**

We presented examples of ncRNAs originating from TEs, such as miRNAs derived from the MADE1 TE. In addition, there are piRNAs that consist of TE sequences and processed SINE and/or Alu transcripts that function as small ncRNAs. Some findings show TE-derived miRNAs to be less conserved than non-TE-derived miRNAs [21], which may imply a species-specific function of TE-derived ncRNAs. There is growing evidence for the meshing of TE sequences with ncRNAs involving both structure and function, and this association has resulted in the formation of new regulatory pathways. It is evident that TE transcripts are an enormous asset to organisms, either as embedded sequences in ncRNAs or as individual RNAs. The interaction between TEs and ncRNAs could be looked at as a "symbiotic partnership" between the cell and transposable elements involving structure and function.

Cells offer TEs the means to multiply and maintain stability. Nevertheless, when active, TEs can become overabundant and a threat to survival. The cell then elicits mechanisms to limit their replication. In this process, cells often use the TE sequences themselves to limit proliferation, as is the case with piRNAs involved in the genetic immune system [71].

Of special significance is that ~80% of long intergenic non-coding RNAs are associated with TEs, and in a nonrandom fashion [56]. They most likely serve functional roles, e.g., retrotransposon LTRs may provide regulatory signals for associated lincRNAs. This may be the tip of the iceberg in terms of TE/lincRNA functions as there are thousands of transcripts found in humans.

Most known ncRNA/target RNA interactions consist of short imperfect base-paired stems, and many miRNAs can bind to and regulate multiple target sequences. But this raises the question of the probability of making mistakes and targeting the wrong RNA. Other factors such as RNA-binding proteins also contribute to RNA recognition and stable formation, but the probability of mistakes must be very low, as imperfect RNA/RNA interactions appear to be highly specific. For example, the Alu-ncRNA/Alu-target RNA recognition must be as stable and specific as that of the intramolecular Watson–Crick base-paired stem recognized by Stau-1 in mRNA-induced degradation in human cells [29,30]. These stems are an interesting example of divergent RNA/RNA structures that bind the same RNA-binding protein. Three-dimensional RNP structures are needed to understand duplex conformations and protein binding sites on the two types of RNA stems.

Multiple ncRNA/target RNA pairings by the same ncRNA are also found in prokaryotes, where ncRNAs bind to different target sequences with different predicted base-pairings, e.g., see [84,85]. Thus stable short ncRNA/target RNA sequence pairings and multiple targeting are found in both prokaryotes and eukaryotes. There is a beauty in the specificity and stability of small imperfect RNA/RNA pairings, and, at least in eukaryotes, the cell appears to use TE sequences to a significant extent in this form of binding.

It is of interest that some TE ncRNAs have been shown to bind proteins. For example, the lncRNA from an LTR retrotransposon situated in an intron binds transcription factors involved in cell proliferation [55]. The B2 ncRNA, a SINE transcript, and its human counterpart, an Alu RNA, bind polymerase II [47]. The piAluRNA binds nuclear proteins [80]. Thus transposon RNA transcripts show versatility in function in that they can also repress nuclear protein functions. The 6S RNA in bacteria, although not a TE transcript, is an example of a prokaryotic ncRNA that binds and inhibits the bacterial polymerase enzyme. This adds to the universality of ncRNA-related regulatory mechanisms in biological species. The 6S RNA was the first ncRNA to be sequenced [86], although its function was not determined until ~30 years later [81]!

The LINE-1/Alu element in a human lncRNA plays a pivotal role in the development of disease. A single mutation in the embedded TE causes human brainstem atrophy [39]. Whether the embedded LINE-1/Alu element is more prone to mutation than the rest of the lncRNA is not known, but this shows that a point mutation in a TE, and not a mutation in a protein gene, is the cause of atrophy and death. This adds to the spectrum of mutations that cause human disease, *i.e.*, non-protein-coding genomic sequences can be important factors in human disease. Related to this, the deregulation of lncRNA transcription in diseases such as cancer has recently been highlighted [87,88]. As there are thousands of ncRNAs associated with TEs whose functions have not been determined, the future may hold some interesting surprises with respect to diseases that may have an aberrant ncRNA etiology.

#### **Acknowledgements**

We thank Kiran Kumar Govindaiah for aid with the figure drawings, and Drs. Stefanie Mortimer and Jennifer Doudna for providing the Alu secondary structure drawing. We are also grateful to Dr. Davide Gabellini for clarification of the evolution of and molecular mechanisms pertaining to the FSHD repeat region. ND acknowledges support from the Department of Molecular Genetics and Microbiology, Stony Brook University.

#### **Conflict of Interest**

The authors declare no conflict of interest.

#### **References**


Reprinted from *IJMS*. Cite as: Hou, J.; Zhao, J. MicroRNA Regulation in Renal Pathophysiology. *Int. J. Mol. Sci.* **2013**, *14*, 13078-13092.

*Review*
