Perspective for Studying the Relationship of miRNAs with Transposable Elements

Transposable elements (TEs) are important sources of miRNA genes, long non-coding RNA genes, and their targets within protein-coding genes in plants and animals. Therefore, the expression levels of specific non-coding RNAs in various tissues and cells under normal and pathological conditions may indicate a programmed pattern of TE activation. This pattern reflects the species-specific composition and distribution of TEs in genomes, which underlie gene regulation in every cell division, including during aging. TE expression is also regulated by epigenetic factors (DNA methylation, histone modifications), SIRT6, the cytidine deaminases APOBEC3 and APOBEC1, and other proteins, such as ERCC, TREX1, RB1, HELLS, and MEGP2. In evolution, protein-coding genes and their regulatory elements are derived from transposons. As part of non-coding regions and introns of genes, TEs are sensors for transcriptional and post-transcriptional control of expression by miRNAs and long non-coding RNAs, which themselves arose from TEs in evolution. Methods (OrbId, ncRNAclassifier) and databases have been created for identifying miRNAs that originate from TEs, including plant databases (PlanTE-MIR DB, PlaNC-TE), which can be used to design epigenetic gene networks in ontogenesis. Based on the data accumulated in the scientific literature, the presence of 467 transposon-derived miRNA genes in the human genome has been reliably established. It is proposed to create a curated, regularly updated online bioinformatics database of miRNAs derived from TEs in healthy individuals, together with the expression changes of these miRNAs during aging and in various diseases, such as cancer and difficult-to-treat diseases. The information obtained can open new horizons in the management of tissue and organ differentiation to slow down aging.
In addition, the created database could become the basis for clarifying the mechanisms of pathogenesis of various diseases (imbalance in the activity of transposable elements, reflected in changes in the expression of miRNAs) and designing their targeted therapy using specific miRNAs as targets. This article provides examples of the detection of transposable elements-derived miRNAs involved in the development of specific malignant neoplasms, aging, and idiopathic pulmonary fibrosis.


Introduction
Multicellular eukaryotes are characterized by the expression of a wide variety of noncoding RNAs (ncRNAs), the number of which is several times higher than the number of protein-coding genes [1]. In evolution, ncRNAs emerged as a means of protecting genomes from the expression of transposable elements (TEs) through the RNA interference (RNAi) system. This system includes the enzymes Dicer (ribonuclease III), RNA-dependent RNA polymerase (RdRP), and Argonaute-PIWI [2]. These enzymes process TE transcripts with the formation of small ncRNAs, which are then used by [...]
In addition to miRNAs, noncoding RNAs also include tRNAs (73-93 nucleotides), small nuclear RNAs (150 nucleotides, denoted by the letter U, participate in splicing), small nucleolar RNAs (60-170 nucleotides, necessary for processing ribosomal RNA), Vault-RNA (100 nucleotides, regulate autophagy and apoptosis), Y-RNA (about 100 nucleotides, bind to the Ro60 protein), small NF90-associated RNAs (snaRs, 117 nucleotides, involved in translation control), ribosomal RNAs (rRNAs), long non-coding RNAs (more than 200 nucleotides in size), and circular RNAs (formed during splicing from exons or introns of mRNA genes) [5]. A common class of ncRNAs is small interfering RNAs (siRNAs), which are generated by the degradation of exogenous dsRNAs, transcribed from TEs or from other types of inverted repeats [2]. Animals are also characterized by the class of piRNAs, 21-35 nucleotides long, which are involved in the regulation of gene expression, antiviral response, and TE silencing (by targeting histone modifications and DNA methylation) [15]. In humans, ncRNA designations are approved by the HUGO Gene Nomenclature Committee (HGNC). For each gene, www.genenames.org (accessed on 20 December 2022) provides information on its symbol, name, chromosomal localization, and links to key resources, such as Ensembl, NCBI Gene, and GeneCards [5].
Studying the relationship of TEs with miRNAs is promising for the use of ncRNAs as tools for correcting TE dysregulation during aging, in cancer, and in various idiopathic diseases. To implement this, it is necessary to create an extensive, regularly updated, universal online database that makes it possible to identify these relationships.

Differences in the Origin of miRNAs from Transposons between Plants and Animals
Unlike animal miRNAs, plant miRNAs are completely complementary to their target mRNA sequences. Their binding, in most cases, causes mRNA cleavage [2]. Moreover, mRNAs can contain several regions that are complementary to miRNAs. Both stages of miRNA precursor cleavage are carried out in the nucleus by the ribonuclease DCL1, after which the miRNA is transported into the cytoplasm by the HASTY protein, which is homologous to the animal Exportin-5 protein [16]. Plants are characterized by a significant variety and number of specific small ncRNAs, which include tasiRNA (trans-acting short interfering RNA), nat-siRNA (natural antisense short interfering RNA), and hc-siRNA (heterochromatic small interfering RNA) [14]. During evolution, plant TEs become sources of both miRNA genes and exons of protein-coding genes. Through these processes, epigenetic regulators (miRNAs) and their targets (gene exons) are formed, and TEs (as miRNA sources) form dynamic gene networks that control the expression of protein-coding genes. One of the mechanisms by which miRNAs originate from TEs is the formation of inverted repeats, which are transcribed into RNA hairpin structures processed by Dicer-like enzymes [17]. TE-derived miRNAs (TEDmiRs) are involved in vital functions, such as stress responses, a barrier to hybridization in plants, and dynamic transformations of heterochromatin during ontogenesis [3].
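The inverted-repeat mechanism described above can be illustrated with a minimal sketch. It scans a DNA sequence for an arm followed, after a short loop, by the reverse complement of that arm, which is the arrangement that lets a transcript fold into a hairpin stem. The function names, the arm length, the maximum loop length, and the toy sequence are all assumptions made for illustration; real hairpin prediction tolerates mismatches and evaluates folding energy, which this sketch does not attempt.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq: str, arm_len: int = 8, max_loop: int = 20):
    """Yield (left_start, right_start, arm) for perfect inverted repeats.

    The right arm must equal the reverse complement of the left arm and
    start at most `max_loop` bases after the left arm ends.
    """
    seq = seq.upper()
    n = len(seq)
    for i in range(n - 2 * arm_len + 1):
        arm = seq[i:i + arm_len]
        target = reverse_complement(arm)
        first = i + arm_len                      # loop length 0
        last = min(first + max_loop, n - arm_len)
        for j in range(first, last + 1):
            if seq[j:j + arm_len] == target:
                yield i, j, arm


# Toy sequence (hypothetical): 8-nt arm, 4-nt loop, reverse-complement arm.
toy = "GATTACAG" + "TTTT" + "CTGTAATC"
hits = list(find_inverted_repeats(toy))
print(hits)
```

On the toy sequence the only hit is the designed pair of arms; on genomic sequence one would expect many short spurious matches, so longer arms or an additional filter would be needed.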
The first such study, published in 2007, examined rice TEs and revealed 21 different small ncRNAs formed from MITEs (miniature inverted-repeat transposable elements), which are localized in introns and exons of protein-coding genes, EST (expressed sequence tag) regions, and intergenic regions [18]. A total of 12 TEDmiRs in Arabidopsis and 83 TEDmiRs in rice, which are also derived from MITEs, were described the following year [19]. An analysis of miRBase and Repbase Update allowed Lorenzetti et al. to create PlanTE-MIR DB, an online resource (http://bioinfotool.cp.utfpr.edu.br/plantemirdb; accessed on 20 December 2022) for the registration of miRNAs derived from TEs [11]. The main sources of miRNAs in plants are LTR-REs since they constitute the bulk of plant genomes. For example, LTR-REs occupy 91% of the total DNA in Asparagus officinalis, 76% in Hordeum vulgare, 58% in Allium cepa, and 55% in Zea mays [20]. In 2018, an article was published on the creation of the PlaNC-TE database (http://planc-te.cp.utfpr.edu.br; accessed on 20 December 2022), according to which 14350 miRNAs originated from TEs in 40 plant genomes [21].
The emergence of miRNAs from TEs is an important adaptive mechanism of plants, which is necessary for their survival. In the wheat genome, TEs occupy 85% of all nucleotides. Of these, MITEs are the most prone to domestication into miRNA precursors. This mechanism of miRNA generation plays a role in the development of wheat immune responses. Of the 48 miRNA families, 16 have been shown to be derived from TEs [22]. In different rice species, the miR812 family, derived from MITEs, is involved in immunity against fungal infections. These mechanisms involve many genes (such as ACO3, CIPK10, LRR) in the 3′- or 5′-UTRs of which MITEs are located [23]. In the tissues of Arabidopsis sporophytes, small RNAs 21-22 nucleotides in length were identified, which are transcribed by RNA polymerase IV from TEs. These ncRNAs were involved in the regulation of many plant genes [24]. In 2020, Marakli described 17 new TEDmiRs (in addition to those previously described in PlanTE-MIR DB) that are involved in purine and nitrogen metabolism, oxidative phosphorylation, and other critical plant functions [25]. Such articles expand the understanding of the role of TEs in the emergence of miRNAs and support the creation of more global database systems for determining the mechanisms of epigenetic regulation of plant and animal ontogenesis.

The Role of Transposons in the Emergence of miRNA in Animals
In animals, the majority of TEs are non-LTR retroelements (LINEs and SINEs). They occupy 35% of all DNA in humans, 28% in mice, and 17% in Drosophila [20]. Maturation of miRNA precursors (by excision of parts of the transcript) is carried out by specific enzymes: it begins in the cell nucleus with the ribonuclease III Drosha, after which the RNA is transported into the cytoplasm (with the help of Exportin-5), where it is processed by the Dicer enzyme [16]. In animals, miRNAs interact with the 3′-UTRs of target mRNAs through partial base pairing (nucleotides 2 to 7 of the miRNA). Binding of miRNAs to the 3′-UTR leads to the repression of gene expression [2]. The 3′-UTRs of genes contain TE remnants, which form a mutually regulatory system since they become targets for miRNAs derived from related TEs. As a result, a complex regulatory epigenetic gene network is formed that controls the development of the body (Figure 2) [26].
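The seed-pairing rule mentioned above (nucleotides 2 to 7 of the miRNA) can be sketched as a simple motif search. The site in the mRNA is the reverse complement of the seed, so the sketch derives that 6-nt motif and lists where it occurs in a 3′-UTR. The function names and both sequences are made up for illustration, and the sketch considers only perfect Watson-Crick seed matches; it is not the scoring scheme of any particular target-prediction tool.

```python
def seed_site(mirna: str) -> str:
    """Return the mRNA motif complementary to miRNA nucleotides 2-7."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    rna = mirna.upper().replace("T", "U")
    seed = rna[1:7]                      # 1-based positions 2..7
    # The target site reads 5'->3' as the reverse complement of the seed.
    return "".join(pairs[base] for base in reversed(seed))


def find_seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in a 3'-UTR."""
    site = seed_site(mirna)
    rna = utr.upper().replace("T", "U")
    return [i for i in range(len(rna) - len(site) + 1)
            if rna[i:i + len(site)] == site]


# Hypothetical miRNA and 3'-UTR fragments, used only to exercise the code.
toy_mirna = "UAGCUUAUCAGACUGAUGUUGA"
toy_utr = "CCAUAAGCUGGAUAAGCUAA"
print(seed_site(toy_mirna), find_seed_sites(toy_mirna, toy_utr))
```

Because a 6-nt motif occurs by chance roughly once every few kilobases, a match found this way is only a candidate site, which is consistent with the partial pairing described in the text.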
Evidence for the emergence of miRNAs from TEs in animals has been obtained in numerous studies. In 2005, Smalheiser and Torvik were the first to describe a model for the formation of miRNAs from TE sequences in mice, rats, and humans through the formation of hairpin DNA structures between two TEs [27]. In 2006, the results of the analysis of a miRNA cluster on human chromosome 19 were presented, according to which miRNAs are dispersed among Alu retroelements (which belong to the SINEs). At least 30 different miRNAs were found to be complementary to Alu [28]. In 2007, 55 different TEDmiRs were described in humans [29]. In 2009, data on 7 TEDmiRs were published for the marsupial Monodelphis domestica [30]. In the same year, 73 miRNAs transcribed from Alu or MIR in humans were characterized using computer modeling. TEs were shown to act not only as sources of miRNAs, but also as regulators of their expression in time and space during the development of the organism. Alu retroelements serve as the basis for the transcriptional regulation of certain miRNA genes [10]. Similar data were obtained in the study of piRNAs and miRNAs derived from TEs at the early stages of embryonic development. These ncRNAs affected the mRNAs of genes involved in key pathways in the regulation of embryogenesis (including the Wnt and TGF-β genes) [31].
In 2010, data on the miR-1302 family, derived from MER53 transposons in humans, were published [32]. In 2011, an article was published about 226 TEDmiRs in humans, 141 TEDmiRs in mice, and 115 TEDmiRs in rhesus monkeys. The authors noted a species-specific expansion of miRNA families, associated with evolutionary transpositions of certain TEs and with large segmental duplications of genomic loci [33]. In 2011, the results of the analysis of more than 15176 individual miRNAs in different animal species were described, with the identification of 2392 TEDmiRs [34]. In the same year, a new approach was described for identifying miRNA targets, based on the analysis of transcripts containing the TEs that serve as miRNA precursors. The method was named OrbId (Origin-based identification of miRNA targets). It helped identify targets for 191 TEDmiRs [35]. In 2012, the origin of 182 miRNAs, 788 siRNAs, and 4990 piRNAs from TEs was described in the silkworm [36]. In the same year, all miRNA precursors from the miRBase database were mapped, and the repetitive elements of the genomes overlapping these regions were determined. The ncRNAclassifier method was developed to classify pre-ncRNAs arising from TEs, and 235 human TEDmiRs and 68 mouse TEDmiRs were described [37].
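The mapping step described above, in which miRNA precursor loci are compared with the repetitive elements that overlap them, comes down to an interval-overlap test. The sketch below is a simplified illustration and not the algorithm of ncRNAclassifier or any other published tool: the `Interval` type, the 0.5 overlap threshold, and all coordinates and names are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str


def overlap_fraction(query: Interval, repeat: Interval) -> float:
    """Fraction of the query interval covered by the repeat interval."""
    if query.chrom != repeat.chrom:
        return 0.0
    shared = min(query.end, repeat.end) - max(query.start, repeat.start)
    return max(shared, 0) / (query.end - query.start)


def te_overlaps(precursors, repeats, min_fraction: float = 0.5):
    """Map each precursor name to the repeats covering >= min_fraction of it."""
    hits = {}
    for pre in precursors:
        names = [rep.name for rep in repeats
                 if overlap_fraction(pre, rep) >= min_fraction]
        if names:
            hits[pre.name] = names
    return hits


# Hypothetical coordinates; real input would come from a miRNA annotation
# and a repeat annotation on the same genome assembly.
precursors = [
    Interval("chr1", 100, 180, "pre-miR-A"),
    Interval("chr1", 500, 580, "pre-miR-B"),
    Interval("chr2", 100, 180, "pre-miR-C"),
]
repeats = [
    Interval("chr1", 90, 150, "MITE-like"),
    Interval("chr1", 560, 700, "Alu-like"),
    Interval("chr2", 50, 400, "L1-like"),
]
print(te_overlaps(precursors, repeats))
```

The threshold matters: with the toy data, lowering `min_fraction` to 0.25 would also report pre-miR-B, which illustrates one reason why different studies can report different TEDmiR counts for the same genome.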
In 2013, the genes of 1213 miRNAs in different eukaryotic genomes were studied, of which 1007 (83%) were derived from various TEs (467 from DNA transposons, 235 from LTR-REs, 186 from LINEs, 119 from SINEs). The authors identified primate-specific expansions in the miR-151, -378, -6130, -6127, -1260, -548, -4536, and -1273 families, including 45 human loci [38]. In 2014, the GENCODE v.19 database was analyzed using the RepeatMasker program, and 1900 TEDmiRs were discovered, of which 406 had previously been described by other authors [39]. In bats, unlike other animals, DNA transposons have the highest activity and are rich sources of most TEDmiRs. TEDmiRs account for 61% of all miRNAs in bats, which is a significant proportion compared with dogs (24%) and horses (17%) [40]. In 2016, data on the detection of 409 TEDmiRs in humans were published [41]. In the same year, an attempt was made to create MDTE (miRNAs derived from TEs), a database of miRNAs derived directly from TEs. Database address: http://bioinf.njnu.edu.cn/MDTE/MDTE.php (accessed on 20 December 2022). This database describes 1251 miRNAs derived from 30 TE families in humans and 6 animal species (cattle, house mouse, chicken, rhesus monkey, common chimpanzee, gray rat) [12]. However, this database is currently unavailable, which underlines the relevance of creating a universal online database of miRNAs derived from TEs.
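The class breakdown reported for the 2013 study can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the text; the variable names are illustrative.

```python
# Counts quoted in the text for the 2013 study [38].
te_class_counts = {
    "DNA transposons": 467,
    "LTR-RE": 235,
    "LINE": 186,
    "SINE": 119,
}
total_mirnas = 1213

te_derived = sum(te_class_counts.values())
share_of_all = te_derived / total_mirnas

print(f"TE-derived miRNAs: {te_derived} of {total_mirnas} "
      f"({share_of_all:.0%})")

# Share of each TE class among the TE-derived miRNAs.
for te_class, count in te_class_counts.items():
    print(f"{te_class}: {count / te_derived:.1%}")
```

The four classes sum to 1007, which matches the reported total, and 1007 of 1213 rounds to the reported 83%.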

Prospects for the Creation of a Human Transposable Elements-Derived miRNA Database
It can be assumed that the majority of animal miRNAs evolved from TEs, since transposons are characterized by high mutability during domestication, which makes it difficult to establish the TE origin of sequences [42]. Of greatest interest is the study of TEDmiRs in humans that are associated with severe diseases, since this will reveal the key pathways of disease pathogenesis and, in the future, allow the design of targeted methods using ncRNAs that target TEs. For example, in 2020, a bioinformatic analysis was carried out to find such TEDmiRs using the TransmiR v.2.0 database. A total of 51 specific miRNAs derived from TEs were identified, of which 34 are associated with various human pathologies [43]. Indeed, miRNAs are promising targets for the targeted therapy of various diseases, which is especially important for malignant neoplasms and idiopathic diseases (when the etiology and pathogenesis have not yet been established). In this regard, on the basis of the data presented in the scientific literature by various authors [12,29,32,33,35,37,39,41,43], data were collected on 467 miRNAs derived from TEs (Table 1). miRNAs are used to predict tumor formation and outcome. For this purpose, appropriate bioinformatics systems are used, such as OncomiR, an online resource on changes in miRNA regulation in malignant neoplasms, which is freely available at www.oncomir.org (accessed on 20 December 2022) [44]. Analysis of this resource using the 467 miRNAs derived from TEs (Table 1) allowed me to identify 52 TEDmiRs whose expression changes are characteristic of specific types of malignant tumors [44]. In order to find aging-associated miRNAs derived from transposons, the Scopus, WoS, and NCBI databases were searched for an association with aging of these 52 cancer-associated TEDmiRs. I entered the names of specific miRNAs together with the words "aging", "change with age", "senescence", and "consenescence" into the search line.
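The two-step screening described above (intersecting the collected TEDmiRs with a cancer resource, then combining each remaining miRNA with aging-related search terms) can be sketched as follows. This is only an illustration of the bookkeeping: the function names are invented, the quoted `AND` query syntax is an assumption that may not match the syntax of Scopus, WoS, or NCBI, and the miR-X names in the intersection example are placeholders rather than real results.

```python
from itertools import product

# Search terms as listed in the text.
AGING_TERMS = ["aging", "change with age", "senescence", "consenescence"]


def cancer_associated(te_derived, cancer_hits):
    """Return TE-derived miRNAs that also appear in a cancer resource."""
    return sorted(set(te_derived) & set(cancer_hits))


def build_queries(mirnas, terms=AGING_TERMS):
    """Combine every miRNA name with every aging-related term."""
    return [f'"{mirna}" AND "{term}"' for mirna, term in product(mirnas, terms)]


# Placeholder inputs (hypothetical names, not data from Table 1 or OncomiR).
shared = cancer_associated(["miR-X1", "miR-X2", "miR-X3"],
                           {"miR-X2", "miR-X3", "miR-X9"})
print(shared)

# Query strings for two of the miRNAs named in the text.
queries = build_queries(["miR-31", "miR-335"])
print(len(queries), queries[0])
```

Generating the queries programmatically keeps the search reproducible: the same term list is applied to every miRNA, so a missing association reflects the literature rather than an omitted query.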
Pathological activation of TEs is characteristic of both human aging and the development of malignant neoplasms, while aging is a risk factor for most types of cancer [45]. Therefore, the scientific literature was analyzed to search for an association with aging of the 52 TEDmiRs whose expression changes in malignant neoplasms. This would make it possible to find common epigenetic relationships between cancer and aging. In the long term, the results obtained could become the basis for a targeted effect on the mechanisms of aging in order to prevent the development of cancer. In my search, 16 of the 52 TEDmiRs analyzed (miR-151a, miR-192, miR-211, miR-28, miR-31, miR-335, miR-340, miR-378a, miR-450b, miR-487b, miR-495, miR-511, miR-576, miR-585, miR-708, miR-885) were found to be associated with aging (Table 1). Aging is characterized by a significant decrease in the level of miR-151a in the blood of healthy people [46], while the expression of miR-192 in the kidneys is significantly increased [47]. Comparison of centenarians with people from families with low life expectancy revealed a significant increase in miR-211 expression in centenarians, and this miRNA was proposed as a biomarker of aging [48]. A significant decrease in the level of miR-28 expression has been shown in the elderly [49]. Increased expression of miR-31 was revealed during replicative aging [50]. This miRNA is a target of histone deacetylases in both malignant neoplasms and aging [51]. A role of miR-335 was identified in human aging and in age-related neurological diseases [52]. Quantitative reverse transcription PCR analysis revealed a role of miR-340 in aging [53]. Estrogen-sensitive miR-378a is involved in the aging mechanisms of the human thymus, as confirmed in experiments on mice [54]. Disruption of miR-450b regulation was found in cellular senescence caused by endogenous genotoxic stress [55]. The involvement of miR-487b in the aging of skeletal muscle has been determined [56]. miR-495 induces senescence of mesenchymal stem cells [57], and the expression of miR-511 changes during aging of the nervous system [58].
Greater enrichment of miR-576 was found in blood plasma exosomes of the elderly compared with young people [59]. miR-487b, which directly interacts with the long ncRNA MAR1 (muscle anabolic regulator 1), can be used as a target for targeted therapy of aging-related muscle atrophy [56]. Oxidative stress contributes to aging and the development of cardiovascular and neurodegenerative diseases. It was found that miR-585 regulates the PARP-1 gene (poly-(ADP-ribose) polymerase 1), the product of which is involved in the repair of oxidatively damaged DNA. Overexpression of this miRNA increases DNA damage and suppresses cell survival [60]. As a result of the study of miRNA expression in Parkinson's disease, it was proposed to use miR-885 as a biomarker of human aging and cellular senescence [61]. Experiments on mice have shown the role of miR-450b in aging [55], as well as a decrease in miR-511 expression during aging [58]. A study of 521 different miRNAs in 6 strains of mice with different average lifespans revealed a significant association with lifespan for three miRNAs, including miR-708 [62], whose expression changes in specific human cancers [44].
TE expression can be regulated at the transcriptional level by miRNAs that are derived from TEs and are therefore complementary to their sequences. This is possible due to the phenomenon of RdDM [7]. Such regulation could help extend the human lifespan, since the pathological expression of TEs is considered a cause of aging [79]. In addition, the use of miRNAs complementary to specific TEs will allow their activity to be regulated in cancer treatment, since the role of pathological activation of TEs in carcinogenesis has been proven [80,81]. It is important to note that TEs are sources of lncRNAs that can serve as pri-miRNAs, which can both be translated on ribosomes to form peptides and be processed into miRNAs. Moreover, the two functional molecules formed in this way participate in the same biological reactions. This indicates the importance of studying the relationship of TEs with lncRNAs and miRNAs. For example, the lncRNA MIR22HG (activated in response to chemical stress) is transcribed into pri-miRNA-22, which is translated into a 9 kDa peptide involved in the antiviral response [82]. The lncRNA MIR497HG is transcribed into pri-miR-497, which is further processed into two mature miRNAs: miR-497 and miR-195. At the same time, pri-miR-497 is translated into miPEP497, which has an oncosuppressive function [83].

Conclusions
The identification of miRNAs derived from TEs is the basis for determining the regulatory mechanisms through which transposons exert global control over the functioning of genomes. It will allow the design of possible ways of influencing physiological and pathological processes in the body, which is promising for the development of modern genetics and medicine. Therefore, it is necessary to create a universal, regularly updated online database of transposon-derived miRNAs. The scientific literature was analyzed, and 467 specific transposon-derived miRNAs were found, which could form the basis for creating such an online database. The analysis of the data presented in Table 1 made it possible to identify 52 different miRNAs derived from transposons that are associated with specific malignant neoplasms. Moreover, it was found that 16 of these 52 miRNAs (miR-151a, miR-192, miR-211, miR-28, miR-31, miR-335, miR-340, miR-378a, miR-450b, miR-487b, miR-495, miR-511, miR-576, miR-585, miR-708, miR-885) are also associated with aging, 9 are associated with idiopathic pulmonary fibrosis, and 6 of them (miR-708, miR-495, miR-487b, miR-340, miR-335, miR-31) are associated with both malignant neoplasms and aging. Since TEs are involved in the global regulation of various body functions, my results can be further used to develop algorithms for the diagnosis and targeted therapy of aging-associated diseases, such as malignant neoplasms and idiopathic pulmonary fibrosis. miRNAs derived from TEs or antisense oligonucleotides, as well as specific peptides formed during the translation of pri-miRNAs, can be used as tools for such targeted therapy.
An analysis of the results presented in the table on the origin of miRNAs from transposons in humans showed that miRNAs are most often formed from LINE elements (108 miRNAs) and SINE elements (94 miRNAs) and less often from DNA transposons (64 miRNAs) and LTR-containing retroelements (53 miRNAs). Since, according to the results of the study (Table 1), the main sources of miRNAs in humans are LINE elements, the scientific literature was analyzed on the role of LINEs in the regulation of embryonic development, in which miRNAs play an important role [84-86]. In 2000, Wei et al. described the accumulation of multiple LINE1 insertions in human cell cultures [87]. In 2007, Garcia-Perez et al. revealed the accumulation of LINE1 insertions in human embryonic stem cells, which was accompanied by the suppression of the activity of specific genes required for cell differentiation. On the basis of the data obtained, the researchers suggested that TEs control the work of the genome during the growth and development of organisms [88]. Upon activation of LINE1 elements, their proteins are used to mobilize SINEs. In 2011, Macia et al. reported the expression of several subfamilies of Alu elements in undifferentiated human embryonic stem cells. At the same time, the activation detected was mainly that of LINE1 elements located within protein-coding genes, which indicates their role in the regulation of these genes [89]. In addition to tissue cultures, consistent transpositions and activation of LINE1, Alu, and SVA have been identified in vivo during early embryogenesis and during tissue differentiation. These changes caused large-scale structural variations in the genomes of experimental animals. In 2004, Prak et al. showed, in transgenic mouse models, that LINE1 can move in vivo during early development [90].
In 2012, organ-specific and stage-specific changes in cell phenotypes were identified in the C57BL/6J mouse line due to structural transformations of their genomes, which were accompanied by changes in the transcriptional activity of certain ERs [91]. Experiments on two-cell mouse embryos have shown that LINE1 is required for the activation of global gene expression during early embryonic development [92]. LINE1 transcripts themselves function as lncRNAs: they interact with KAP1 and Nucleolin, stimulate rDNA gene expression, and silence other genes in the two-cell embryo by suppressing Dux (a transcription factor that controls the two-cell genetic program) [93].

Conflicts of Interest:
The authors declare no conflict of interest.