Identification of microRNAs Derived from Transposable Elements in the Macaca mulatta (Rhesus Monkey) Genome

Transposable elements (TEs) are mobile DNA entities that can move within the host genome. Over long periods of evolutionary time, TEs are typically silenced via the accumulation of mutations in the genome, ultimately resulting in their immobilization. However, they still play an important role in the host genome by acting as regulatory elements. They influence host transcription in various ways, one of which as the origin of the generation of microRNAs (miRNAs), which are so-called miRNAs derived from TEs (MDTEs). miRNAs are small non-coding RNAs that are involved in many biological processes by regulating gene expression at the post-transcriptional level. Here, we identified MDTEs in the Macaca mulatta (rhesus monkey) genome, which is phylogenetically close species to humans, based on the genome coordinates of miRNAs and TEs. The expression of 5 out of 17 MDTEs that were exclusively registered in M. mulatta from the miRBase database (v22) was examined via quantitative polymerase chain reaction (qPCR). Moreover, Gene Ontology analysis was performed to examine the functional implications of the putative target genes of the five MDTEs.


Introduction
microRNAs (miRNAs) are a class of small non-coding RNAs approximately 22 nucleotides in length [1,2].Typically, miRNAs inhibit translation or induce mRNA degradation by binding to the 3 untranslated region (3 UTR) of the target messenger RNA (mRNA), acting as post-transcriptional regulators.miRNAs play a key role in the regulation of gene function and are involved in a variety of biological processes such as cell proliferation, differentiation, and angiogenesis [3,4].In addition, dysregulation of miRNAs occurs during the progression of many diseases such as obesity and cancer [5,6].Generally, miRNAs are transcribed from miRNA genes via RNA polymerase II.Nonetheless, some miRNAs originate from transposable elements (TEs), known as miRNAs derived from TEs (MDTEs) [7,8].
TEs are mobile DNA entities composed of repetitive sequences that induce genetic diversity via insertion into the host genome [9,10].TEs are generally categorized into two classes based on their transposition intermediates [11].Class I TEs are retrotransposons that insert into the host genome via reverse transcription of the RNA intermediate using a copy-and-paste mechanism [11,12].Class I TEs are categorized into several types: long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), and long terminal repeats (LTRs).Class II TEs are referred as DNA transposons that enter the host genome in the form of DNA segments using a cut-and-paste mechanism.Across extended periods of evolution, TEs tend to become quiescent due to the gradual accumulation of mutations within the genome, eventually leading to their immobilization.However, TEs continue to exert their effects on the host genome by functioning as regulatory elements.They influence host transcription in various ways, including serving as a source of miRNA generation.Some of the TEs have palindromic sequences that form miRNA hairpins, or the insertion of two similar TEs at adjacent positions within the genome can lead to the formation of the hairpin structures that could function as miRNAs [13][14][15].It has been revealed that MDTEs are incorporated into the RNA-Induced Silencing Complex (RISC) by binding to Argonaute (AGO) proteins, and regulate gene expression in the same way as the other non-TE-derived miRNAs [16,17].MDTEs can be divided into two categories based on the overlap between miRNAs and TEs.In the first case, TEs completely overlap with precursor miRNAs (pre-miRNAs).This results in the generation of two mature miR-NAs from a single pre-miRNA, both of which originate from the same TE.In the second case, TEs partially overlap with pre-miRNAs, causing one of the two mature miRNAs to share a complete sequence overlap with the TE [18][19][20].Although TEs are present in most prokaryotic and eukaryotic species, research on MDTEs remains limited [11].
In our previous study, we investigated MDTEs and their implication to human diseases.Herein, we extend our study on MDTEs to M. mulatta, one of the most extensively studied non-human primates (NHPs).M. mulatta, also known as the rhesus monkey or rhesus macaque, belongs to the subfamily Cercopithecinae of Old World monkeys and is native to South Asian countries [21,22].They are an essential animal model for human health and disease research because of their high genetic proximity to humans (showing approximately 93% genome identity) and similarities in organ function to humans [23,24].For these reasons, M. mulatta is considered a natural intermediate model that bridges the evolutionary and genetic gap between humans and the experimental rodents often used in human clinical trials [25][26][27].The use of NHPs in experiments has led to important advances in biology and medicine because they play a crucial role in testing the safety of new drugs in human clinical trials [28].In particular, M. mulatta made significant contributions during the COVID-19 pandemic as the most appropriate animal model for vaccine development [29].However, despite its great contributions as a disease model organism, there are unbridgeable differences in genome sequences between M. mulatta and humans.Understanding and studying these differences could aid in drug development and safety testing.Genome-based analysis revealed that the differences in the genomes of these two species are found in non-coding regions, such as UTRs, and are not in the highly conserved protein-coding regions [24].The sequence divergence of the UTRs (particularly within the 3 UTR) causes alterations in gene expression patterns by modifying the binding sites for gene expression regulators, including transcription factors and miRNAs, ultimately contributing to the evolutionary dynamics [30].Consequently, the sequence divergence also induces the expression alteration in these regulatory factors between two species [31].Therefore, studying the differences in miRNAs between M. mulatta and humans can help researchers understand how they contribute to species-specific traits and adaptations.It may also provide an opportunity to overcome the challenges that interspecies differences can pose to drug testing for human clinical trials.However, a few studies have been conducted on miRNAs in M. mulatta, but no study on MDTEs.
In this study, we confirmed MDTEs within the genome of M. mulatta by intersecting the genomic coordinates of TEs and miRNAs.In addition, we compared identified M. mulatta MDTEs to human MDTEs based on our previous study [32].As a result, we verified 68 common MDTEs between M. mulatta and humans and 17 MDTEs that were exclusively registered in M. mulatta based on the miRBase database (v22).We performed quantitative polymerase chain reaction (qPCR) to confirm the expression levels of 5 of 17 M. mulattaspecific MDTEs and conducted gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis to estimate their functional implications.Further studies are needed, but this research can contribute to the prediction and safety testing of novel therapeutics in preclinical trials by providing the first report of MDTEs that are exclusively present in the M. mulatta genome.

Identification of miRNAs Derived from Transposable Elements in M. mulatta Genome
To identify the genomic coordinates of M. mulatta miRNAs, the mml.gff3 file was downloaded from the miRBase v22 database (https://mirbase.org/(accessed on 9 March 2023)) and a total of 990 mature miRNAs were registered.Among them, miRNAs labelled 'JSUE', of which the chromosomal location is uncertain, were excluded.Additionally, the multi-copy miRNAs produced from different chromosomes but having the same mature miRNA sequences were counted only once.Considering these two conditions, a total of 907 mature miRNAs remained.Using BedTools [33], the chromosomal locations of mature miRNAs were joined (intersectBed with options wa and wb) with the RepeatMasker output (BCM Mmul_8.0.1/rheMac8) downloaded from the UCSC table browser (https://genome.ucsc.edu/cgi-bin/hgTables(accessed on 9 March 2023)).Following the definition of MDTEs established in previous studies, we selected mature miRNAs that were completely covered by TE sequences as MDTEs [8,18].Simple repeats and low-complexity elements were not considered in this study.According to the miRBase database (v22), five out of the 17 MDTEs identified only in M. mulatta were selected for further study, all with a read count of more than 100 from the deep sequencing results.The secondary structures and minimum free energy (MFE) of the pre-miRNAs were calculated using the RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi(accessed on 12 March 2023)).

RNA Extraction and Complementary DNA (cDNA) Synthesis
Tissue samples from male and female M. mulatta were provided by the National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB).Total RNA was isolated using Hybrid-R TM (GeneAll, Seoul, Republic of Korea) according to the manufacturer's instructions.Total RNA was quantified for both concentration and purity using an ND1000 UV-Vis spectrophotometer (NanoDrop Technologies, Wilmington, NC, USA).Then, the reverse transcription of total RNA was performed using an HB miR Multi Assay Kit TM (SYSTEM I & SYSTEM II; HeimBiotek, Seoul, Republic of Korea) in accordance with the manufacturer's recommendations.The conditions for cDNA synthesis were as follows: incubation at 37 • C for 60 min (SYSTEM I) or 50 • C for 60 min (SYSTEM II), followed by incubation at 95 • C for 5 min and holding at 4 • C.

Quantitative Polymerase Chain Reaction (qPCR)
To examine the relative expression of miRNAs, qPCR was performed using the HB_I Real-Time PCR Master Mix Kit (HeimBiotek, Seoul, Republic of Korea).Specific primers for miRNA amplification were designed and synthesized by HeimBiotek, Inc.Small nuclear RNA (snRNA) U6 was used to normalize the relative miRNA expression, and the same amount of each cDNA sample was used for the experiment.The PCR was performed on a Quantstudio1 system (Applied Biosystems, Foster City, CA, USA) and the conditions were as follows: hold at 95 • C for 15 min for initial activation, followed by 40 thermal cycles at 95 • C for 10 s and 60 • C for 40 s; standard melting conditions with 90 s at 55 • C and then 5 s each at 1 • C increments between 55 • C and 99 • C. The ramp rate of the last transformation of 60 • C to 95 • C was set at 0.15 • C/s.All samples were analyzed in triplicate and the relative expression data were analyzed using the 2 −∆∆Ct method.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analysis
Target gene prediction for the miRNAs was performed using TargetScan Human 8.0 (https://www.targetscan.org/vert_80/(accessed on 7 July 2023)), a web-based tool that computationally predicts miRNA targets.Predicted targets were required to have a total context ++ score of ≤−0.15 to reduce the number of false positives.Afterward, the selected target genes were subjected to GO and KEGG pathway analysis to examine the enrichment analysis using the web tool ShinyGO 0.80 (http://bioinformatics.sdstate.edu/go80/ (accessed on 3 October 2023)) with a false discovery rate (FDR) cut-off of 0.05.All tools were used with the species set to M. mulatta.

Statistical Analyses
Each experiment was conducted in triplicate, and the mean ± standard deviation (SD) of the data was plotted on a bar graph.

Identification of MDTEs in the M. mulatta Genome
The workflow of this study is illustrated in Figure 1.Briefly, we used the mml.gff3 file downloaded from the miRBase database (v22) to obtain the chromosomal coordinates of M. mulatta mature miRNAs and intersected them with the RepeatMasker output file containing the genomic coordinates of repeat sequences in the M. mulatta genome.Consequently, based on mature miRNAs, we confirmed the presence of 101 unique MDTEs in the M. mulatta genome.Detailed information on all the MDTEs is provided in Supplementary Table S1.Considering the total of 907 M. mulatta miRNAs registered in miRBase (v22), MDTEs accounted for approximately 11.1% of the total miRNAs (Figure 2a).Among the four TE classes, DNA transposons were most frequently responsible for MDTE generation, generating a total of 44 MDTEs.LINEs come second with 33 MDTEs, SINE with 19, and LTR with only five.At a more detailed level, the most abundant family of DNA transposons was TcMar-Mariner.For LINE, SINE, and LTR, it was the L2, MIR, and the ERVL-MaLR families, respectively.The detailed numbers of TE families constituting the MDTEs for each TE class are shown in Figure 2b and Supplementary Table S1.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analysis
Target gene prediction for the miRNAs was performed using TargetScan Human 8.0 (https://www.targetscan.org/vert_80/(accessed on 7 July 2023)), a web-based tool that computationally predicts miRNA targets.Predicted targets were required to have a total context ++ score of ≤−0.15 to reduce the number of false positives.Afterward, the selected target genes were subjected to GO and KEGG pathway analysis to examine the enrichment analysis using the web tool ShinyGO 0.80 (http://bioinformatics.sdstate.edu/go80/(accessed on 3 October 2023)) with a false discovery rate (FDR) cut-off of 0.05.All tools were used with the species set to M. mulatta.

Statistical Analyses
Each experiment was conducted in triplicate, and the mean ± standard deviation (SD) of the data was plotted on a bar graph.

Identification of MDTEs in the M. mulatta Genome
The workflow of this study is illustrated in Figure 1.Briefly, we used the mml.gff3 file downloaded from the miRBase database (v22) to obtain the chromosomal coordinates of M. mulatta mature miRNAs and intersected them with the RepeatMasker output file containing the genomic coordinates of repeat sequences in the M. mulatta genome.Consequently, based on mature miRNAs, we confirmed the presence of 101 unique MDTEs in the M. mulatta genome.Detailed information on all the MDTEs is provided in Supplementary Table S1.Considering the total of 907 M. mulatta miRNAs registered in miRBase (v22), MDTEs accounted for approximately 11.1% of the total miRNAs (Figure 2a).Among the four TE classes, DNA transposons were most frequently responsible for MDTE generation, generating a total of 44 MDTEs.LINEs come second with 33 MDTEs, SINE with 19, and LTR with only five.At a more detailed level, the most abundant family of DNA transposons was TcMar-Mariner.For LINE, SINE, and LTR, it was the L2, MIR, and the ERVL-MaLR families, respectively.The detailed numbers of TE families constituting the MDTEs for each TE class are shown in Figure 2b and Supplementary Table S1.

Comparative Analysis of M. mulatta and Human MDTEs
In our previous investigation, we employed the same approach to identify MDTEs in the human genome and identified 352 miRNAs that were completely covered with TE sequences [34].Here, we compared our findings on 101 MDTEs from M. mulatta and 352 MDTEs from humans to show how similar and different they are.There were 68 shared MDTEs between the two genomes (Table 1).Among the TE classes of miRNAs' origin, the most overlapping class between the two genomes was DNA transposons, followed by LINEs, SINEs, and LTRs.Most miRNA origins were consistent with the level of TE names, but four miR-NAs (miR-582-5p, miR-6130, miR-1304-5p, and miR-558) have different TE name origins.For instance, miR-582-5p originates from CR1-3_Croc in the CR1 family of the LINE class in rhesus, but in humans, it originates from L3 in the CR1 family.
Among the remaining 30 miRNAs, 10 were present in both M. mulatta and humans but were only derived from TEs in M. mulatta.Among these, miR-5697-5p and miR-5697-3p are registered in the miRBase database only for M. mulatta and humans.miR-4796-3p and miR-892b have been registered in chimpanzee and Bornean orangutan, which seems primate specific.The remaining four miRNAs (miR-151-5p, miR-151-3p, miR-421, and miR-507) are conserved across species.Apart from the three miRNAs (miR-548i-3p, miR-7177-5p, and miR-7177-3p) which are not registered in humans but are present in other species such as horses and common marmosets, we found 17 MDTEs that were only identified in M. mulatta according to the miRBase database (v22) (Table 2).

Comparative Analysis of M. mulatta and Human MDTEs
In our previous investigation, we employed the same approach to identify MDTEs in the human genome and identified 352 miRNAs that were completely covered with TE sequences [34].Here, we compared our findings on 101 MDTEs from M. mulatta and 352 MDTEs from humans to show how similar and different they are.There were 68 shared MDTEs between the two genomes (Table 1).Among the TE classes of miRNAs' origin, the most overlapping class between the two genomes was DNA transposons, followed by LINEs, SINEs, and LTRs.Most miRNA origins were consistent with the level of TE names, but four miRNAs (miR-582-5p, miR-6130, miR-1304-5p, and miR-558) have different TE name origins.For instance, miR-582-5p originates from CR1-3_Croc in the CR1 family of the LINE class in rhesus, but in humans, it originates from L3 in the CR1 family.
Among the remaining 30 miRNAs, 10 were present in both M. mulatta and humans but were only derived from TEs in M. mulatta.Among these, miR-5697-5p and miR-5697-3p are registered in the miRBase database only for M. mulatta and humans.miR-4796-3p and miR-892b have been registered in chimpanzee and Bornean orangutan, which seems primate specific.The remaining four miRNAs (miR-151-5p, miR-151-3p, miR-421, and miR-507) are conserved across species.Apart from the three miRNAs (miR-548i-3p, miR-7177-5p, and miR-7177-3p) which are not registered in humans but are present in other species such as horses and common marmosets, we found 17 MDTEs that were only identified in M. mulatta according to the miRBase database (v22) (Table 2).

The Secondary Structures of Exclusively Existing MDTEs in M. mulatta
Since there is no experimental evidence or previous research on these 17 MDTEs, we examined their expression using quantitative polymerase chain reaction (qPCR).To narrow down the candidates for qPCR verification, criteria based on >100 read counts were used.As a result, five miRNAs (mml-miR-7163-5p, mml-miR-7168-3p, mml-miR-7194-5p, mml-miR-7174-5p, and mml-miR-7206-3p) were selected for further analysis.Before qPCR, the hairpin structures of the five MDTEs were predicted using pre-miRNA sequences obtained from the miRBase database (v22).All miRNAs exhibited a stem-loop configuration, as shown in Figure 3.In addition, minimum free energy (MFE) values were calculated to confirm the stability of the pre-miRNA structures.All putative pre-miRNA structures had MFE values lower than -17.90kcal/mol, with mml-mir-7206 being the most stable structure with the lowest MFE value of -32.40kcal/mol.This was followed by mml-mir-7168, mml-mir-7174, and mml-mir-7163, with mml-mir-7194 having the highest MFE value of −17.90 kcal/mol.

The Secondary Structures of Exclusively Existing MDTEs in M. mulatta
Since there is no experimental evidence or previous research on these 17 MDTEs, we examined their expression using quantitative polymerase chain reaction (qPCR).To narrow down the candidates for qPCR verification, criteria based on >100 read counts were used.As a result, five miRNAs (mml-miR-7163-5p, mml-miR-7168-3p, mml-miR-7194-5p, mml-miR-7174-5p, and mml-miR-7206-3p) were selected for further analysis.Before qPCR, the hairpin structures of the five MDTEs were predicted using pre-miRNA sequences obtained from the miRBase database (v22).All miRNAs exhibited a stem-loop configuration, as shown in Figure 3.In addition, minimum free energy (MFE) values were calculated to confirm the stability of the pre-miRNA structures.All putative pre-miRNA structures had MFE values lower than -17.90kcal/mol, with mml-mir-7206 being the most stable structure with the lowest MFE value of -32.40kcal/mol.This was followed by mml-mir-7168, mml-mir-7174, and mmlmir-7163, with mml-mir-7194 having the highest MFE value of −17.90 kcal/mol.

The Relative Expression Levels of MDTEs in M. mulatta Tissues
Given that no other study has validated the expression of these five MDTEs, we assessed their expression levels using qPCR in 14 tissues (e.g., bladder, cerebellum, cerebrum, heart, kidney, large intestine, liver, lung, ovary, pancreas, muscle, small intestine, spleen, and from a female M. mulatta and 13 tissues (e.g., bladder, cerebellum, cerebrum, heart, kidney, large intestine, liver, lung, pancreas, small intestine, spleen, stomach, and testis) from a male M. mulatta (Figure 4).

The Relative Expression Levels of MDTEs in M. mulatta Tissues
Given that no other study has validated the expression of these five MDTEs, we assessed their expression levels using qPCR in 14 tissues (e.g., bladder, cerebellum, cerebrum, heart, kidney, large intestine, liver, lung, ovary, pancreas, muscle, small intestine, spleen, and stomach) from a female M. mulatta and 13 tissues (e.g., bladder, cerebellum, cerebrum, heart, kidney, large intestine, liver, lung, pancreas, small intestine, spleen, stomach, and testis) from a male M. mulatta (Figure 4).Among the MDTEs, mml-miR-7206-3p was not universally expressed across all tissues, and its expression pattern varied between a female and male.mml-miR-7206-3p was not detected in the large intestine of a female, whereas it was absent in the kidney, large intestine, liver, small intestine, and stomach of a male.Notably, the only tissue in which mml-miR-7206-3p was not expressed in either sex was the large intestine.Additionally, its expression was upregulated in female muscle and male cerebrum tissue.
In contrast, other miRNAs (e.g., mml-miR-7163-5p, mml-miR-7168-3p, mml-miR-7174-5p, and mml-miR-7194-5p) were expressed in all tissues of both sexes, with distinct expression patterns observed between a female and male.The expression of mml-miR-7163-5p was particularly prominent in the kidneys of both sexes.While showing high expression in all male tissues other than the liver, mml-miR-7163-5p displayed lower expression in the female bladder, cerebellum, ovary, and small intestine.The expression of mml-miR-7168-3p was specific to the liver and muscle of a female M. mulatta.In a male, it exhibited widespread expression except in the bladder and was particularly pronounced in the pancreas and heart.The expression of mml-miR-7174-5p was significantly higher in the liver of female M. mulatta and in the kidney and stomach of male M. mulatta.The expression of mml-miR-7194-5p was significantly high in the lung of female M. mulatta.Additionally, mml-miR-7194-5p was highly expressed in the kidney and stomach of a male, resembling the expression patterns of mml-miR-7174-5p in male.

GO and KEGG Pathway Enrichment Analysis of Target Genes Regulated by Five MDTEs
To examine the functional implications of these five MDTEs, their putative target genes were identified in M. mulatta using TargetScan Human 8.0.The predicted target genes for each miRNA were selected based on the criteria of a total context++ score ≤ −0.15 (Supplementary Tables S2-S6).Subsequently, GO enrichment analysis was conducted to understand the functional and statistical significance of the target genes influenced by the miRNAs of interest using the ShinyGO 0.80 database (FDR cut-off 0.05) (Figure 5).The known function of genes was organized into three main categories: biological processes (BP), molecular functions (MF), and cellular components (CC), and each group was evaluated separately.
The predicted target genes regulated by mml-miR-7163-5p were involved in various BP.Many genes were associated with transport and some exhibited significant enrichment in the regulation of protein import into the nucleus and the regulation of protein import in BP.In terms of MF, five binding pathways were identified and notably enriched in neurotrophin TRK receptor binding and neurotrophin receptor binding, exhibiting an enrichment of over 20 fold.Additionally, various genes associated with cellular organelles, particularly the endoplasmic reticulum, influenced the nuclear outer membrane-endoplasmic reticulum membrane network and the endoplasmic reticulum membrane in CC.Among the MDTEs, mml-miR-7206-3p was not universally expressed across all tissues, and its expression pattern varied between a female and male.mml-miR-7206-3p was not detected in the large intestine of a female, whereas it was absent in the kidney, large intestine, liver, small intestine, and stomach of a male.Notably, the only tissue in which mml-miR-7206-3p was not expressed in either sex was the large intestine.Additionally, its expression was upregulated in female muscle and male cerebrum tissue.
In contrast, other miRNAs (e.g., mml-miR-7163-5p, mml-miR-7168-3p, mml-miR-7174-5p, and mml-miR-7194-5p) were expressed in all tissues of both sexes, with distinct expression patterns observed between a female and male.The expression of mml-miR-7163-5p was particularly prominent in the kidneys of both sexes.While showing high expression in all male tissues other than the liver, mml-miR-7163-5p displayed lower expression in the female bladder, cerebellum, ovary, and small intestine.The expression of mml-miR-7168-3p was specific to the liver and muscle of a female M. mulatta.In a male, it exhibited widespread expression except in the bladder and was particularly pronounced in the pancreas and heart.The expression of mml-miR-7174-5p was significantly higher in the liver of female M. mulatta and in the kidney and stomach of male M. mulatta.The expression of mml-miR-7194-5p was significantly high in the lung of female M. mulatta.Additionally, mml-miR-7194-5p was highly expressed in the kidney and stomach of a male, resembling the expression patterns of mml-miR-7174-5p in male.

GO and KEGG Pathway Enrichment Analysis of Target Genes Regulated by Five MDTEs
To examine the functional implications of these five MDTEs, their putative target genes were identified in M. mulatta using TargetScan Human 8.0.The predicted target genes for each miRNA were selected based on the criteria of a total context++ score ≤ −0.15 (Supplementary Tables S2-S6).Subsequently, GO enrichment analysis was conducted to understand the functional and statistical significance of the target genes influenced by the miRNAs of interest using the ShinyGO 0.80 database (FDR cut-off 0.05) (Figure 5).The known function of genes was organized into three main categories: biological processes (BP), molecular functions (MF), and cellular components (CC), and each group was evaluated separately.
The predicted target genes regulated by mml-miR-7163-5p were involved in various BP.Many genes were associated with transport and some exhibited significant enrichment in the regulation of protein import into the nucleus and the regulation of protein import in BP.In terms of MF, five binding pathways were identified and notably enriched in neurotrophin TRK receptor binding and neurotrophin receptor binding, exhibiting an enrichment of over 20 fold.Additionally, various genes associated with cellular organelles, particularly the endoplasmic reticulum, influenced the nuclear outer membrane-endoplasmic reticulum membrane network and the endoplasmic reticulum membrane in CC.The target genes potentially affected by mml-miR-7168-3p were involved in numerous pathways in each category.Concerning the BP, these genes were associated with processes such as the transmembrane receptor protein tyrosine kinase signaling pathway and the enzyme-linked receptor protein signaling pathway.In MF, the genes were linked to functions like growth factor receptor binding, kinase regulatory activity, and protein kinase regulator activity.As for CC, the top three components with the highest fold enrichment were related to neuronal structures, including the distal axon, axon, and postsynapse.
The putative target genes of mml-miR-7174-5p were highly associated with muscle structure development in BP and SNAP receptor activity was overrepresented in MF.In terms of CC, four components were enriched more than 3 fold, including the synaptobrevin 2-SNAP-25-syntaxin-1a complex, sorting endosome, CRD-mediated mRNA stability complex, and SNARE complex.
The genes potentially influenced by mml-miR-7194-5p were significantly overrepresented in the regulation of alternative mRNA splicing via splicesome by more than 7.5 fold in BP and phospholipase D activity by more than 40 fold in MF.Regarding CC, the genes were associated with three components and were significantly enriched in the protein acetyltransferase complex and acetyltransferase complex.
The target genes likely regulated by mml-miR-7206-3p were associated with BP and MF, but not CC.The genes showed a tendency to be involved in migration-related processes, such as the regulation of mononuclear cell migration, the positive regulation of leukocyte migration, and the regulation of leukocyte migration in BP.Moreover, in MF, only two functions were overrepresented: calcium-dependent protein serine/threonine phosphatase activity and calmodulin-dependent protein phosphatase activity.
After conducting GO enrichment analysis, KEGG analysis was performed on the potential target genes of the five MDTEs with a total context++ score ≤ −0.15, similar to the GO enrichment analysis, to understand the high-level functions of the biological system (FDR cut-off 0.05).However, the putative target genes of two MDTEs, mml-miR-7163-5p and mml-miR-7174-5p, were not significantly involved in the biological pathways that meet the FDR cut-off of 0.05.As a result, no analysis results were obtained for these two MDTEs.The target genes of the remaining three MDTEs were subjected to KEGG pathway analysis and the results are presented in Figure 6.Analysis of the candidate target genes regulated by mml-miR-7168-3p revealed their involvement in 21 pathways.Notably, the top two pathways with the highest fold enrichment were the glycosphingolipid biosynthesis-ganglio series and circadian rhythm.In particular, the glycosphingolipid biosynthesis-ganglio series pathway exhibited a remarkable 5-fold enrichment.For mml-miR-7194-5p, its putative target genes were associated with 14 pathways.The most enriched pathways, type II diabetes The target genes potentially affected by mml-miR-7168-3p were involved in numerous pathways in each category.Concerning the BP, these genes were associated with processes such as the transmembrane receptor protein tyrosine kinase signaling pathway and the enzyme-linked receptor protein signaling pathway.In MF, the genes were linked to functions like growth factor receptor binding, kinase regulatory activity, and protein kinase regulator activity.As for CC, the top three components with the highest fold enrichment were related to neuronal structures, including the distal axon, axon, and postsynapse.
The putative target genes of mml-miR-7174-5p were highly associated with muscle structure development in BP and SNAP receptor activity was overrepresented in MF.In terms of CC, four components were enriched more than 3 fold, including the synaptobrevin 2-SNAP-25-syntaxin-1a complex, sorting endosome, CRD-mediated mRNA stability complex, and SNARE complex.
The genes potentially influenced by mml-miR-7194-5p were significantly overrepresented in the regulation of alternative mRNA splicing via splicesome by more than 7.5 fold in BP and phospholipase D activity by more than 40 fold in MF.Regarding CC, the genes were associated with three components and were significantly enriched in the protein acetyltransferase complex and acetyltransferase complex.
The target genes likely regulated by mml-miR-7206-3p were associated with BP and MF, but not CC.The genes showed a tendency to be involved in migration-related processes, such as the regulation of mononuclear cell migration, the positive regulation of leukocyte migration, and the regulation of leukocyte migration in BP.Moreover, in MF, only two functions were overrepresented: calcium-dependent protein serine/threonine phosphatase activity and calmodulin-dependent protein phosphatase activity.
After conducting GO enrichment analysis, KEGG analysis was performed on the potential target genes of the five MDTEs with a total context++ score ≤ −0.15, similar to the GO enrichment analysis, to understand the high-level functions of the biological system (FDR cut-off 0.05).However, the putative target genes of two MDTEs, mml-miR-7163-5p and mml-miR-7174-5p, were not significantly involved in the biological pathways that meet the FDR cut-off of 0.05.As a result, no analysis results were obtained for these two MDTEs.The target genes of the remaining three MDTEs were subjected to KEGG pathway analysis and the results are presented in Figure 6.Analysis of the candidate target genes regulated by mml-miR-7168-3p revealed their involvement in 21 pathways.Notably, the top two pathways with the highest fold enrichment were the glycosphingolipid biosynthesisganglio series and circadian rhythm.In particular, the glycosphingolipid biosynthesisganglio series pathway exhibited a remarkable 5-fold enrichment.For mml-miR-7194-5p, its putative target genes were associated with 14 pathways.The most enriched pathways, type II diabetes mellitus and the adipocytokine signaling pathway, each showed over a 6-fold enrichment.KEGG analysis of the predicted target genes of mml-miR-7206-3p identified associations in only two pathways: Th17 cell differentiation and cellular senescence.
mellitus and the adipocytokine signaling pathway, each showed over a 6-fold enrichment.KEGG analysis of the predicted target genes of mml-miR-7206-3p identified associations in only two pathways: Th17 cell differentiation and cellular senescence.

Discussion
M. mulatta is one of the most widely used non-human primate species in medical and biological research due to its similarities in physiology and organ function to humans [25][26][27].To overcome the challenges that may arise in human clinical trials, it is important to understand the species-specific traits of the animal model, including the different gene expression responses regulated by differentially expressed miRNAs between M. mulatta and humans.Many miRNA gene families are evolutionarily conserved among mammalian species.It is considered that the same conserved miRNAs regulate similar pathways and biological processes among closely related species, such as primates [34,35].However, MDTEs are more likely to represent species-specific traits than non-TE-derived miRNAs for some reasons: the insertion patterns of TEs in the genome is different across species, and some TE families are species-specific.In our previous study, we identified MDTEs in the human genome and investigated the diseases associated with them [32].After classifying human MDTEs, we specifically focused on those derived from human endogenous retrovirus (HERV) sequences, which may represent human-specific properties.HERVs are a subfamily of LTR acquired from the human genome via exogenous retroviruses infection during primate evolution [36,37].There were 29 HERV-derived miRNAs, and among them, 23 MDTEs were only found in humans [38].Among the 23 HERV-derived miRNAs, hsa-miR-4454, which is derived from the HERV-H family, showed oncogenic traits by targeting the two

Discussion
M. mulatta is one of the most widely used non-human primate species in medical and biological research due to its similarities in physiology and organ function to humans [25][26][27].To overcome the challenges that may arise in human clinical trials, it is important to understand the species-specific traits of the animal model, including the different gene expression responses regulated by differentially expressed miRNAs between M. mulatta and humans.Many miRNA gene families are evolutionarily conserved among mammalian species.It is considered that the same conserved miRNAs regulate similar pathways and biological processes among closely related species, such as primates [34,35].However, MDTEs are more likely to represent species-specific traits than non-TE-derived miRNAs for some reasons: the insertion patterns of TEs in the genome is different across species, and some TE families are species-specific.In our previous study, we identified MDTEs in the human genome and investigated the diseases associated with them [32].After classifying human MDTEs, we specifically focused on those derived from human endogenous retrovirus (HERV) sequences, which may represent human-specific properties.HERVs are a subfamily of LTR acquired from the human genome via exogenous retroviruses infection during primate evolution [36,37].There were 29 HERV-derived miRNAs, and among them, 23 MDTEs were only found in humans [38].Among the 23 HERV-derived miRNAs, hsa-miR-4454, which is derived from the HERV-H family, showed oncogenic traits by targeting the two tumor suppressor genes, DNAJB4 and SASH1, in non-muscle-invasive bladder cancer (NMIBC).

Figure 1 .
Figure 1.Workflow depicting the steps for MDTE identification and analysis.Figure 1. Workflow depicting the steps for MDTE identification and analysis.

Figure 1 .
Figure 1.Workflow depicting the steps for MDTE identification and analysis.Figure 1. Workflow depicting the steps for MDTE identification and analysis.

Figure 2 .
Figure 2. MDTE composition in M. mulatta.(a) The ratio of MDTEs among all miRNAs and the distribution of TE classes contributing to miRNA origin; (b) the proportion of TE families in each TE class that generates miRNAs.

Figure 2 .
Figure 2. MDTE composition in M. mulatta.(a) The ratio of MDTEs among all miRNAs and the distribution of TE classes contributing to miRNA origin; (b) the proportion of TE families in each TE class that generates miRNAs.

Table 1 .
Common MDTEs between M. mulatta and humans.

Table 2 .
The 17 MDTEs exclusively registered in M. mulatta based on miRBase.

Table 2 .
The 17 MDTEs exclusively registered in M. mulatta based on miRBase.