Article

Structural Analysis of microRNAs in Myeloid Cancer Reveals Consensus Motifs

1 Faculty of Physics and Earth Sciences, Peter Debye Institute, Leipzig University, 04103 Leipzig, Germany
2 Institute of Molecular and Cell Physiology, Hannover Medical School, Carl-Neuberg-Straße 1, 30625 Hannover, Germany
3 Excellence Cluster Cardiopulmonary System, University of Giessen and Marburg Lung Center (UGMLC), Justus-Liebig-University, 35392 Giessen, Germany
* Author to whom correspondence should be addressed.
Genes 2022, 13(7), 1152; https://doi.org/10.3390/genes13071152
Submission received: 15 May 2022 / Revised: 19 June 2022 / Accepted: 24 June 2022 / Published: 26 June 2022
(This article belongs to the Section RNA)

Abstract

MicroRNAs (miRNAs) are short non-coding RNAs that function in post-transcriptional gene silencing and mRNA regulation. Although miRNAs range from 17 to 27 nucleotides in length, most are 22 nucleotides long. The expression of miRNAs changes significantly in cancer, causing protein alterations in cancer cells by preventing some genes from being translated into proteins. In this research, a structural analysis of 587 miRNAs that are differentially expressed in myeloid cancer was carried out. Length distribution studies revealed a mode and median of 22 nucleotides, with an average of 21.69 and a variance of 1.65. We performed nucleotide analysis for each position; overall, Uracil was the most observed nucleotide (27.8%) and Adenine the least observed one (22.6%). There was a higher frequency of Adenine at the beginning of the sequences when compared to Uracil, which was more frequent at the end of miRNA sequences. The purine content of each implicated miRNA was also assessed. A novel motif analysis script was written to detect the most frequent 3–7 nucleotide (3–7n) long motifs in the miRNA dataset. We detected CUG (42%) as the most frequent 3n motif, CUGC (15%) as a 4n motif, AGUGC (6%) as a 5n motif, AAGUGC (4%) as a 6n motif, and UUUAGAG (3%) as a 7n motif. In the second part of our study, we further characterized the motifs by analyzing whether they align at certain consensus sequences in our miRNA dataset, whether certain motifs target the same genes, and whether these motifs are conserved in other species. This thorough structural study of miRNA sequences provides a novel strategy to study the implications of miRNAs in health and disease. A better understanding of miRNA structure is crucial for developing therapeutic approaches.

1. Introduction

MicroRNAs (miRNAs) are single-stranded non-coding RNAs made up of short nucleotide sequences with lengths varying between 17 and 27 nucleotides, the vast majority being 20–21 nucleotides long [1,2]. Although miRNAs are relatively short sequences, they are effective enough to function as gene handbrakes and prevent long transcripts from being translated into proteins. MiRNAs interact with specific parts of a transcript by base-pairing [3].
MiRNAs play a crucial role in maintaining the homeostasis of important metabolic pathways and processes [4]. The alteration of miRNA levels is associated with cancer and developmental biology [5]. Up- or downregulation of miRNA expression can contribute to the development and spread of cancer [6]. Changes in protein levels caused by miRNA regulation affect the molecular function and balance of the cell [7]. While the amount of intracellular protein is directly related to the expression of genes, it is also indirectly affected by miRNAs, which silence their target transcripts before these are translated into proteins [8]. As a result, cancer cells synthesize less of the affected proteins than required, which in turn drives them toward cancer-associated behavior [9]. In a cancer cell, gene expression is altered directly via mutations or methylation and inactivated indirectly through the expression of miRNAs [10]. Research in fields other than cancer, such as cell developmental biology, stem cell, and cardiovascular research, has also shown that a cell's miRNA expression affects the deactivation of some biological mechanisms [11,12]. The impact of such miRNA-based control on gene expression makes miRNAs important epigenetic factors that could be used effectively as therapeutic targets in translational research [13,14].
Although miRNA genes are very short compared to genes coding for proteins, they are transcribed like genes and fulfill their functions using complementary base pairing. They function as a part of the ribonucleoprotein complex RISC (RNA-induced silencing complex) and, by binding to the complementary target, they potentiate the action of the RISC [15]. The nucleotides at positions 2 to 8 near the 5′ end of miRNAs are the predominant binding sites of miRNAs and are called the seed region [16]. In some forms of binding, seed complementarity alone is not sufficient, and additional pairing in the central or end region of the miRNA is required [17]. In most cases, miRNAs interact with the 3′ untranslated regions (3′UTRs) of target mRNAs to induce mRNA degradation and translational repression [2]. This striking positional affinity has led to the development of miRNA target search algorithms that focus on 3′UTRs, which further amplifies the bias toward functional 3′UTR sites [18]. However, effective miRNA-binding sites have also been identified in the 5′UTR or the open reading frame (ORF) of target mRNAs [19,20,21,22]. Different studies and computational tools measure and report the miRNA-gene relationship in different ways [23]. In this work, we first asked which miRNAs have similar nucleotide sequences, and second, whether the extent of this similarity provides information about consensus miRNAs and their target proteins [24].

2. Materials and Methods

The main purpose of the research was to analyze the nucleotide sequences and specific motifs of microRNAs implicated in myeloid cancer. To achieve this, the miRNA-seq data were collected and structured from different databases for analysis, as described in the workflow chart in Figure 1.
We used the GDC Data Portal to find the miRNAs that are most frequently altered in myeloid cancer (https://portal.gdc.cancer.gov/) (accessed on 25 April 2016) [25]. Specifically, the TCGA project for Acute Myeloid Leukemia (TCGA-LAML) was used under the filters transcriptome profiling, miRNA Expression Quantification, miRNA-Seq, and BCGSC miRNA Profiling. The dataset was downloaded in May 2021 and consists of the microRNA expression levels of the 197 AML patients determined by Illumina HiSeq 2000 microRNA-seq. The level 3 sequencing data (expression levels of each miRNA), transformed to the log2 scale, were used [26]. The set of isoform.quantification.txt files, which give read counts at base-pair resolution, contained the total read counts for mature miRNAs (corresponding to miRBase v13 MIMAT identifiers), normalized to reads per million (RPM).
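The RPM normalization and log2 scaling mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline that produced the TCGA level 3 files: the identifiers and read counts are hypothetical, and the pseudocount of 1 added before taking the logarithm is an assumption (the text does not state how zero counts were handled).

```python
import math


def rpm(read_counts):
    """Scale raw read counts to reads per million (RPM) within one sample."""
    total = sum(read_counts.values())
    return {mirna: count * 1_000_000 / total for mirna, count in read_counts.items()}


def log2_scale(values, pseudocount=1.0):
    """Log2-transform values; the pseudocount (an assumption) avoids log2(0)."""
    return {mirna: math.log2(value + pseudocount) for mirna, value in values.items()}


# Hypothetical read counts for one sample (not taken from TCGA-LAML).
sample = {"MIMAT_A": 500, "MIMAT_B": 1500, "MIMAT_C": 0}

normalized = rpm(sample)
logged = log2_scale(normalized)

for mirna in sample:
    print(mirna, normalized[mirna], round(logged[mirna], 3))
```

Dividing by the per-sample total makes expression levels comparable across patients with different sequencing depths.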
Next, the nucleotide sequences of these miRNAs were retrieved manually from the miRBase database (https://www.mirbase.org/) (accessed on 1 February 2022) [27] for use in sequence and motif analysis. The nucleotide lengths of these miRNAs were plotted in a histogram, and descriptive statistics were computed using RStudio [28]. Next, the nucleotide frequency at each position was assessed together with the purine/pyrimidine content using an Excel spreadsheet. In the second part of the project, we wrote a C++ script to analyze the miRNA sequences and identify the motifs in miRNAs in cancer (code deposited in GitHub [29]).
To identify the target genes of the miRNAs that contain the consensus motifs, we first downloaded the validated target gene data from the miRTarBase database [30]. Target genes of all the sequences containing the motif of interest were taken from this database (for each motif separately). The total number of target genes (how many genes are targeted by the sequences with the motif) and the gene frequency (the number of sequences with the motif targeting the same gene) were calculated (for each motif separately).
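The two quantities defined above (total number of target genes and gene frequency) can be computed with a simple counter. The sketch below is an illustration only: the miRNA names, gene names, and the `targets` mapping are hypothetical placeholders, not records from miRTarBase, and the function name is ours.

```python
from collections import Counter


def gene_frequencies(mirnas_with_motif, targets):
    """Count, for each gene, how many motif-containing miRNAs target it."""
    counts = Counter()
    for mirna in mirnas_with_motif:
        # A set ensures each miRNA contributes at most once per gene.
        counts.update(set(targets.get(mirna, ())))
    return counts


# Hypothetical validated targets (placeholder names, not real data).
targets = {
    "mir-x1": ["GENE_A", "GENE_B"],
    "mir-x2": ["GENE_A", "GENE_C"],
    "mir-x3": ["GENE_A", "GENE_B", "GENE_B"],
}
mirnas_with_motif = ["mir-x1", "mir-x2", "mir-x3"]

frequency = gene_frequencies(mirnas_with_motif, targets)
total_target_genes = len(frequency)
shared_by_all = [g for g, n in frequency.items() if n == len(mirnas_with_motif)]

print(total_target_genes, dict(frequency), shared_by_all)
```

Genes whose frequency equals the number of motif-containing miRNAs are the ones targeted by every sequence carrying the motif.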
Finally, to search for the conservation of our motifs, we downloaded the mature miRNA sequences of different species from miRBase (mirbase.org). All of the sequences for the selected species were analyzed for motifs with our program written in C++. Databases with 5n and 6n motif frequencies were built for each species, and the motif frequency was derived as the number of miRNA sequences (from one species) that contain the motif.

3. Results

The data retrieved from the GDC Portal were from the miRNA profiling of patients with hematopoietic and reticuloendothelial cancer. This yielded a dataset of 587 miRNAs elevated in myeloid cancers. The sequences of these miRNAs were extracted from miRBase; a sample of the data is given in Table 1.

3.1. miRNA Sequence Length and Nucleotide Frequency Analysis

We analyzed the sequence length of the miRNAs and plotted the frequency as a histogram, together with the descriptive statistics as a boxplot on top (Figure 2).
Most miRNAs in myeloid cancer were found to be 21, 22, and 23 nucleotides in length, with percentages of 19.8%, 49.8%, and 13.5%, respectively. We calculated a mode and median of 22 nucleotides, with an average of 21.69 and a variance of 1.65. The shortest miRNAs consist of 17 nucleotides (namely, hsa-mir-1260, hsa-mir-1825, hsa-mir-1207, hsa-mir-453, hsa-mir-1268, hsa-mir-1306, hsa-mir-1321, and hsa-mir-1827) and the four longest miRNAs consist of 26 and 27 nucleotides (hsa-mir-1248, hsa-mir-1183, hsa-mir-1272, and hsa-mir-1244).
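As an illustration of how such length statistics are obtained, the sketch below computes the length distribution, mode, median, mean, and variance for the seven sample sequences listed in Table 1. Because this is only a small subset, the output will not reproduce the values reported for the full 587-miRNA dataset. The analysis in the paper was run in RStudio; the use of Python and of the sample variance (the convention of R's `var`) are assumptions made here.

```python
from collections import Counter
from statistics import mean, median, multimode, variance

# Sample rows from Table 1 (a subset, not the full dataset).
sequences = {
    "hsa-mir-1248": "ACCUUCUUGUAUAAGCACUGUGCUAAA",
    "hsa-mir-1183": "CACUGUAGGUGAUGGUGAGAGUGGGCA",
    "hsa-mir-1272": "GAUGAUGAUGGCAGCAAAUUCUGAAA",
    "hsa-mir-1244": "AAGUAGUUGGUUUGUAUGAGAUGGUU",
    "hsa-mir-921": "CUAGUGAGGGACAGAACCAGGAUUC",
    "hsa-mir-638": "AGGGAUCGCGGGCGGGUGGCGGCCU",
    "hsa-mir-1279": "UCAUAUUGCUUCUUUCU",
}

lengths = [len(seq) for seq in sequences.values()]

# Percentage of sequences at each length (the histogram values).
length_counts = Counter(lengths)
length_percent = {n: 100 * c / len(lengths) for n, c in sorted(length_counts.items())}

summary = {
    "modes": multimode(lengths),    # all most-frequent lengths
    "median": median(lengths),
    "mean": mean(lengths),
    "variance": variance(lengths),  # sample variance (n - 1 denominator)
}

print(length_percent)
print(summary)
```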
It is not yet known whether the nucleotide length of miRNAs plays an active role in cancer or other diseases. Perfect base-pairing leads to the degradation of mRNA (a mechanism mainly seen in plants) and imperfect base-pairing with the target mRNA leads to repression of translation [31]. In this line of argumentation, it could be assumed that the longer the miRNA sequence, the higher the probability to complement the target mRNAs. However, this needs experimental proof. Based on the volume of literature published for each miRNA in miRBase, we noted a higher volume of research carried out on short miRNAs (consisting of 17 and 18 nucleotides) when compared to the ones with 26 and 27 nucleotides (Table 2).
Next, the percentage of each nucleotide at every position was quantified (Figure 3). The analysis was done from the first (5′) to the last (3′) nucleotide position of the miRNAs. Overall, Uracil was the most observed nucleotide and Adenine the least observed one, with 27.8% and 22.6%, respectively.
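A per-position count of this kind can be sketched as below, again using the sample sequences from Table 1 rather than the full dataset. The paper performed this step in a spreadsheet; the choice to divide by the number of sequences that actually reach a given position (rather than by all sequences) is an assumption of this sketch.

```python
from collections import Counter

# Sample sequences from Table 1 (a subset, not the full dataset).
sequences = [
    "ACCUUCUUGUAUAAGCACUGUGCUAAA",
    "CACUGUAGGUGAUGGUGAGAGUGGGCA",
    "GAUGAUGAUGGCAGCAAAUUCUGAAA",
    "AAGUAGUUGGUUUGUAUGAGAUGGUU",
    "CUAGUGAGGGACAGAACCAGGAUUC",
    "AGGGAUCGCGGGCGGGUGGCGGCCU",
    "UCAUAUUGCUUCUUUCU",
]


def position_frequencies(seqs):
    """Return one dict per position (5' to 3') with the percentage of A, C, G, U."""
    table = []
    for i in range(max(len(s) for s in seqs)):
        # Only sequences long enough to have a nucleotide at position i + 1.
        counts = Counter(s[i] for s in seqs if len(s) > i)
        total = sum(counts.values())
        table.append({nt: 100 * counts[nt] / total for nt in "ACGU"})
    return table


freq_table = position_frequencies(sequences)
for position, row in enumerate(freq_table, start=1):
    print(position, {nt: round(pct, 1) for nt, pct in row.items()})
```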
In the first nucleotide position, 183 miRNAs have an A and 186 miRNAs start with a U. There is a higher frequency of Adenines at the beginning of the sequences when compared to Uracil which is more frequent at the end of miRNA sequences. In particular, there is a high density of Uracils between nucleotide positions 22 and 25. Purine (A and G) and pyrimidine (C and U) nucleotide bases were analyzed for their frequency in the studied miRNA structures (Supplementary Table S2). The highest and lowest purine-scoring miRNAs are listed in Table 3. hsa-mir-765 (85.71% purines), hsa-mir-1468 (81.82%), and hsa-mir-1910 (80.00%) are some of the very high purine content miRNAs. The lowest purine content is present in hsa-mir-1281 (5.88%) and hsa-mir-483 (9.52%).
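The purine score used above is simply the share of A and G in a sequence. A minimal sketch, applied to one sample sequence from Table 1 (hsa-mir-638); the function name is ours and not taken from the authors' materials:

```python
def purine_percent(seq):
    """Percentage of purines (A and G) in an RNA sequence."""
    return 100 * sum(nt in "AG" for nt in seq) / len(seq)


# hsa-mir-638 as listed in Table 1.
mir_638 = "AGGGAUCGCGGGCGGGUGGCGGCCU"

print(round(purine_percent(mir_638), 2))        # purine content
print(round(100 - purine_percent(mir_638), 2))  # pyrimidine content
```

The pyrimidine content is the complement, since every position holds either a purine or a pyrimidine.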

3.2. Motifs in microRNA Sequences Implicated in Myeloid Cancer

In addition to the miRNA structure analysis, common motifs were determined according to their length and frequency. To find the most abundant motifs in the miRNA sequences, we searched for motifs of 3, 4, 5, 6, and 7 nucleotides shared among the miRNA sequences. We started with 3-nucleotide (3n) motifs, the length of the smallest unit encoding an amino acid, and then repeated the same analysis for 4n, 5n, 6n, and 7n motifs.
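The motif search described here amounts to counting, for each k-nucleotide substring, how many sequences contain it. The sketch below shows one way to do this; it is not the authors' C++ script [29], and the three input sequences are hypothetical toy examples chosen only to make the counts easy to follow.

```python
from collections import Counter


def motif_sequence_counts(seqs, k):
    """For every k-nucleotide motif, count the sequences that contain it."""
    counts = Counter()
    for seq in seqs:
        # A set counts each motif once per sequence, even if it repeats.
        motifs = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(motifs)
    return counts


# Hypothetical toy sequences (not from the dataset).
toy_sequences = ["CUGCUGA", "ACUGCAA", "AAGUGCU"]

counts_3n = motif_sequence_counts(toy_sequences, 3)
counts_4n = motif_sequence_counts(toy_sequences, 4)

for motif, n in counts_3n.most_common(3):
    print(motif, n, f"{100 * n / len(toy_sequences):.0f}%")
```

Counting sequences rather than raw occurrences matches the way the results are reported (for example, a motif "found in 36 miRNAs").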

3.2.1. 3n miRNA Motifs

In the first round of motif search, we analyzed the sequences for 3n motifs, which are the smallest significant motifs. The most observed and the shortest motifs in the miRNA sequences were CUG, UGC, UGG, UGU, CAG, UUG, CCU, CUU, GUG, AGG, UCU, GCU, CGU, CGC, GCG, UGC, ACG, and CGA. These were found in 91.65% of miRNAs. Only 8.35% did not have any of these 3n motifs in their sequences (for example, hsa-mir-122 and hsa-mir-1181) (Table 4).

3.2.2. 4n miRNA Motifs

We divided the 4n motifs identified into two groups, the ones occurring in more than 75 miRNAs (the most detected) and the ones occurring in less than 10 miRNAs (the least detected) (Table 5).
CUGC, ACUG, and UGCA were found as the most detected motifs in 87 and 85 different miRNAs, respectively. On the other hand, CGAA, CGAG, CGUA, and UCGA were the least detected 4n motifs in 10 different miRNA sequences. Moreover, 112 sequences of 587 microRNAs (19%) do not have any of the top 4n motifs. In addition, 28 of 112 sequences have the least common motifs, and 84 of the miRNAs do not have any of the listed motifs (Supplementary Table S3).

3.2.3. Longer Motifs

The purpose of searching for longer motifs (5n, 6n, and 7n) was to find potentially conserved or master sites in miRNAs. A total of 271 different 5n motifs were detected (Supplementary Table S3). AGUGC was the most frequent 5n motif, found in 36 miRNAs (6%). A total of 38 different 5n motifs were unique (Table 6). The other frequently detected long motifs were 6n and 7n sequences. The most frequently observed 6n motifs were AAGUGC and GCUUCC (each detected in 22 different miRNAs, 4%), while UUUAGAG was the most detected 7n motif in our dataset (found in 19 miRNAs, 3%). Finally, the longest motifs detected were 8n (AAGUGCUU), 9n (AAGUGCUUC), 10n (AAAGUGCUUC), and 20n (AAAGUGCUUCCCUUUAGAGU). hsa-mir-106a, hsa-mir-302a, b, c, d, e, and hsa-mir-526b contain the 8n, 9n, and 10n motifs, whereas hsa-mir-520a, b, c, d, e, g, and h include all the long motifs in their structures (Supplementary Table S3).

3.2.4. Consensus miRNA Sequences Having Many Motifs

Consensus motifs were analyzed in the miRNA sequences, elucidating the consecutive alignment of our motifs in different miRNA sequences to different degrees. In this way, detailed results were obtained about where the identified motifs are located in the miRNA, and how they appear in high-consensus sequences (Table 7).
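Locating where a motif sits inside a sequence is a plain substring search. As an illustration, the sketch below reports the 1-based start positions of several motifs from Section 3.2.3 inside the 20n motif reported there; the helper name is ours, and this is not the authors' implementation.

```python
def motif_positions(seq, motif):
    """Return all 1-based start positions of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


# The 20n motif reported in Section 3.2.3.
consensus_20n = "AAAGUGCUUCCCUUUAGAGU"

for motif in ("AGUGC", "AAGUGC", "GUGCUUC", "UUUAGAG"):
    print(motif, motif_positions(consensus_20n, motif))
```

The shorter motifs nest inside one another at the 5′ end of this sequence, while UUUAGAG occupies positions 13 to 19 near the 3′ end.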
The results of this analysis show that a miRNA can be associated with one or more mRNA targets through the common motifs in its sequence. Apart from the importance of motifs and consensus sequences for the binding of miRNAs to their targets, our results may also be relevant because these sequences can be targets of RNA-binding proteins (RBPs), which recognize specific sequence motifs and are key regulators of miRNA function. Although transcription factors and epigenetic modifications control the synthesis of miRNAs, their regulation after synthesis is tightly controlled by RBPs [32]. Studies on RBP binding and the regulation of miRNAs remain insufficient. Among more than 500 identified human RBPs, only a few have been characterized in terms of their function on oncogene and tumor suppressor mRNAs [33]. Much remains to be learned about miRNA processing by RBPs in healthy and disease states. The complexity of regulation is further increased by indications that miRNAs and RBPs cooperate in controlling common mRNA targets [34]. Taking all this into account, a detailed study of the structure of these short RNA molecules, which can perform so many functions, is crucial, and the results presented in our study can serve as a starting point and raw material for such studies, especially in cancer models.

3.2.5. Target Genes of 7n Motifs

We next analyzed the target genes of the miRNAs that share common motifs, to see whether they give hints about the functional aspects of the motifs we identified in myeloid cancer. For this study, 7n motifs were selected as they are longer and can be more specific in their targets [1]. Using the miRTarBase database, we identified the targets of our miRNAs that have been experimentally validated in different studies. The overlapping genes are listed in Table 8, and all the detected targets are given in Supplementary Table S4.
Our 7n motif GUGCUUC is present in 15 different miRNA sequences, and all of them target the same six genes (EIF2S1, SPRED1, HIP1, YOD1, ELK4, ABHD15). The same holds for many motifs to different degrees. This suggests that the identified motifs are an important factor in target recognition and that small changes in the sequences can affect the specificity of binding and regulation.

3.2.6. Conserved 5n and 6n Motifs

We further wanted to test our motif-finding script in analyzing the conserved miRNA motifs in different species. For this, 5n and 6n motifs were searched in the available miRNA sequences from different species; 645 miRNAs for Pongo pygmaeus, 1978 for Mus musculus, 437 for Caenorhabditis elegans, 690 for Equus caballus, 469 for Drosophila melanogaster, 600 for Picea abies, 695 for Oreochromis niloticus, 1138 for Monodelphis domestica, 1232 for Gallus gallus, 548 for Ciona intestinalis, and 2654 for humans (Supplementary Table S5). Among the top 15 identified motifs, we combined the ones that were common to all species and derived their percentages for specific species studied (Table 9).
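The cross-species comparison can be sketched as: compute the motif frequency per species, take the top-ranked motifs in each, and keep those common to all species. The example below uses two hypothetical species with toy sequences and a top-2 cut-off (the study used the top 15); breaking ties alphabetically is an assumption of this sketch.

```python
from collections import Counter


def motif_percentages(seqs, k):
    """Percentage of sequences in one species that contain each k-nucleotide motif."""
    counts = Counter()
    for seq in seqs:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return {motif: 100 * n / len(seqs) for motif, n in counts.items()}


def shared_top_motifs(species_seqs, k, top_n):
    """Motifs that rank within the top_n of every species."""
    top_sets = []
    for seqs in species_seqs.values():
        pct = motif_percentages(seqs, k)
        # Sort by descending frequency; ties broken alphabetically (assumption).
        ranked = sorted(pct, key=lambda m: (-pct[m], m))[:top_n]
        top_sets.append(set(ranked))
    return set.intersection(*top_sets)


# Hypothetical toy data (not miRBase sequences).
species_seqs = {
    "species_x": ["AAGUGCA", "CAAGUGC"],
    "species_y": ["AAGUGCU", "GGAAGUG"],
}

shared = shared_top_motifs(species_seqs, k=5, top_n=2)
for name, seqs in species_seqs.items():
    pct = motif_percentages(seqs, 5)
    print(name, {m: pct[m] for m in sorted(shared)})
```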
MiRNAs are key regulators of many cellular processes and may be one of the main players in post-transcriptional regulation. Because they also influence vital biological processes, they tend to be conserved between species. However, there have been contradictory reports on the SNP density of these regions when compared to control [35]. Here, we show that the conservation may happen with certain motifs inside the miRNAs and the higher SNP density may be present in other parts of the miRNA, which add to the list of target genes of miRNA without disturbing the main target interactions.

4. Discussion

More than 50% of human genes are predicted to be regulated by miRNAs [36]. They are powerful post-transcriptional modulators of mRNA translation that are proven to regulate many important processes in cancer progression as well [37]. There are more than 70 disease studies that associate with miRNAs [38]. Some of them target the oncogene products and the others the tumor suppressor gene products [39]. Acute myeloid leukemia is a disease characterized by the buildup of immature myeloid cells, mainly resulting from the genetic background. However, the emerging field of miRNA research has already identified certain miRNA profiles behind the disease that correlate with prognosis [40]. Such studies were generally concentrated on the functional effects of specific miRNAs. In our study, we have a general look at the structural aspects shared by miRNAs elevated in AML patients. We addressed different aspects of the structure of the miRNAs up- or downregulated in myeloid cancer patients.
The first aspect we looked at was the length distribution of miRNAs studied, which had a mode and median of 22 nucleotides, the same as found by a previous study done on overall human miRNAs [1]. In the same study, it was shown that the distribution does not follow a Poisson distribution where the mean and variance would be equal, but rather a Laplace distribution fits better. Next, our analysis of the nucleotide distribution in every position implied the existence of structural patterns in the miRNA sequences. There is a higher occupancy of Adenines at the beginning of the sequences and of Uracil at the end of the sequences. This multinomial distribution of nucleotides in different positions was also noted as significant in the work of Fang et al. for the overall miRNAs studied, except that they did not find a significant difference between the GC and AU content of their samples. In our miRNA dataset of myeloid cancer patients, there was a grouping of miRNAs based on their purine and pyrimidine content, which further supports the pattern presence in their sequences.
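The mean and variance reported in Section 3.1 make the mismatch with a Poisson model easy to see. The short calculation below is an illustration under stated assumptions: the Laplace scale parameter is obtained by simple moment matching to the reported variance, which is our choice for the example and not a fit performed in this study or in [1].

```latex
\begin{align*}
% Poisson model: mean and variance coincide
X \sim \mathrm{Poisson}(\lambda) &: \quad \mathbb{E}[X] = \operatorname{Var}(X) = \lambda \\
% Values reported for the 587 miRNAs
\bar{x} = 21.69, \quad s^{2} = 1.65 &\;\Rightarrow\; \frac{s^{2}}{\bar{x}} \approx 0.076 \ll 1 \\
% Laplace density with location \mu and scale b
f(x \mid \mu, b) &= \frac{1}{2b} \exp\!\left(-\frac{|x - \mu|}{b}\right),
  \qquad \operatorname{Var}(X) = 2b^{2} \\
% Moment matching to the reported variance (assumption)
b &= \sqrt{s^{2}/2} = \sqrt{0.825} \approx 0.91
\end{align*}
```

A variance more than ten times smaller than the mean reflects how tightly the lengths cluster around 22 nucleotides, which a Poisson distribution with a mean near 22 cannot reproduce.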
MiRNAs can have many mRNA targets due to their ability to exert their function even with imperfect base-pairing. Fang et al. found a positive correlation between the average miRNA length and the number of targets, which may be explained by the higher affinity of longer miRNAs for their targets [1]. In our study, we presented a way to examine the motifs that target the same genes. Further research should address the functional importance of many miRNAs targeting the same genes through the same motifs.
Some miRNAs were shown to have conserved functions beginning from mosses and ferns [41]. In a study by Vazquez et al., longer miRNAs were shown to have a more recent history in Arabidopsis. A correlation between conservation and the bases at certain miRNA sites has also been reported [42]. In another study, Lewis et al. noted that the nucleotide positions upstream and downstream of the seed regions of miRNAs were highly conserved [43]. In our research, we have shown that certain motifs are conserved between different species to different degrees.
Overall, our research may serve as a strategy to study the common structural aspects of miRNAs in different human diseases. Furthermore, it can be extended to study the functional outcomes of the presence of motifs and more cancer types to produce an inclusive comparative study.

5. Conclusions

In conclusion, this research reveals motif sequences of miRNAs implicated in myeloid cancer, which were also shown to be clustered in consensus sequences. Moreover, it was shown that many of these motifs tend to be conserved across species.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13071152/s1. Table S1: List of all identified miRNAs implicated in myeloid cancer; Table S2: Purine and pyrimidine analysis of all miRNAs; Table S3: Motif analysis results of miRNAs studied; Table S4: Target gene analysis for all miRNAs; Table S5: Conserved motif analysis among species.

Author Contributions

Conceptualization, methodology, writing—original draft preparation, S.D.; investigation, data analysis, software, A.C.; data analysis, visualization, writing—review and editing, E.S. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by Leipzig University and the Soft Matter Physics Division.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available online (https://doi.org/10.6084/m9.figshare.20152466) or upon request from the corresponding author.

Acknowledgments

S.D. thanks Josef Kas, Bahriye Aktas, lab members of Soft Matter Physics in Leipzig University, and Jörg Schnauß for support. S.D. also acknowledges the support from the German Research Foundation (DFG) Project number KA 1116/22-1.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Fang, Z.; Du, R.; Edwards, A.; Flemington, E.K.; Zhang, K. The Sequence Structures of Human MicroRNA Molecules and Their Implications. PLoS ONE 2013, 8, e54215.
2. O’Brien, J.; Hayder, H.; Zayed, Y.; Peng, C. Overview of MicroRNA Biogenesis, Mechanisms of Actions, and Circulation. Front. Endocrinol. 2018, 9, 402.
3. Lin, S.-L.; Chang, D.C.; Ying, S.-Y.; Shao-Yao, Y. Isolation and Identification of Gene-Specific MicroRNAs. Methods Mol. Biol. 2006, 342, 313–320.
4. Nelson, M.C.; O’Connell, R.M. MicroRNAs: At the Interface of Metabolic Pathways and Inflammatory Responses by Macrophages. Front. Immunol. 2020, 11, 1797.
5. Costa, C.; Teodoro, M.; Rugolo, C.A.; Alibrando, C.; Giambò, F.; Briguglio, G.; Fenga, C. MicroRNAs alteration as early biomarkers for cancer and neurodegenerative diseases: New challenges in pesticides exposure. Toxicol. Rep. 2020, 7, 759–767.
6. Xu, P.; Wu, Q.; Yu, J.; Rao, Y.; Kou, Z.; Fang, G.; Shi, X.; Liu, W.; Han, H. A Systematic Way to Infer the Regulation Relations of miRNAs on Target Genes and Critical miRNAs in Cancers. Front. Genet. 2020, 11, 278.
7. Li, M.; Marin-Muller, C.; Bharadwaj, U.; Chow, K.-H.; Yao, Q.; Chen, C. MicroRNAs: Control and Loss of Control in Human Physiology and Disease. World J. Surg. 2009, 33, 667–684.
8. Haralambieva, I.H.; Kennedy, R.B.; Simon, W.L.; Goergen, K.M.; Grill, D.E.; Ovsyannikova, I.G.; Poland, G.A. Differential miRNA expression in B cells is associated with inter-individual differences in humoral immune response to measles vaccination. PLoS ONE 2018, 13, e0191812.
9. Catalanotto, C.; Cogoni, C.; Zardo, G. MicroRNA in Control of Gene Expression: An Overview of Nuclear Functions. Int. J. Mol. Sci. 2016, 17, 1712.
10. Croce, C.M. Causes and consequences of microRNA dysregulation in cancer. Nat. Rev. Genet. 2009, 10, 704–714.
11. Çakmak, H.A.; Demir, M. MicroRNA and Cardiovascular Diseases. Balk. Med. J. 2020, 37, 60–71.
12. Khan, A.Q.; Ahmed, E.I.; Elareer, N.R.; Junejo, K.; Steinhoff, M.; Uddin, S. Role of miRNA-Regulated Cancer Stem Cells in the Pathogenesis of Human Malignancies. Cells 2019, 8, 840.
13. Arif, K.M.T.; Elliott, E.K.; Haupt, L.M.; Griffiths, L.R. Regulatory Mechanisms of Epigenetic miRNA Relationships in Human Cancer and Potential as Therapeutic Targets. Cancers 2020, 12, 2922.
14. Brosnan, C.A.; Palmer, A.J.; Zuryn, S. Cell-type-specific profiling of loaded miRNAs from Caenorhabditis elegans reveals spatial and temporal flexibility in Argonaute loading. Nat. Commun. 2021, 12, 2194.
15. Pratt, A.J.; MacRae, I.J. The RNA-induced Silencing Complex: A Versatile Gene-silencing Machine. J. Biol. Chem. 2009, 284, 17897–17901.
16. Chu, Y.-W.; Chang, K.-P.; Chen, C.-W.; Liang, Y.-T.; Soh, Z.T.; Hsieh, L. miRgo: Integrating various off-the-shelf tools for identification of microRNA–target interactions by heterogeneous features and a novel evaluation indicator. Sci. Rep. 2020, 10, 1466.
17. Gorski, S.A.; Vogel, J.; Doudna, J.A. RNA-based recognition and targeting: Sowing the seeds of specificity. Nat. Rev. Mol. Cell Biol. 2017, 18, 215–228.
18. Bartel, D.P. MicroRNAs: Target Recognition and Regulatory Functions. Cell 2009, 136, 215–233.
19. Duursma, A.M.; Kedde, M.; Schrier, M.; le Sage, C.; Agami, R. miR-148 targets human DNMT3b protein coding region. RNA 2008, 14, 872–877.
20. Forman, J.J.; Legesse-Miller, A.; Coller, H.A. A search for conserved sequences in coding regions reveals that the let-7 microRNA targets Dicer within its coding sequence. Proc. Natl. Acad. Sci. USA 2008, 105, 14879–14884.
21. Henke, J.I.; Goergen, D.; Zheng, J.; Song, Y.; Schüttler, C.G.; Fehr, C.; Jünemann, C.; Niepmann, M. microRNA-122 stimulates translation of hepatitis C virus RNA. EMBO J. 2008, 27, 3300–3310.
22. Jopling, C.L.; Yi, M.; Lancaster, A.M.; Lemon, S.M.; Sarnow, P. Modulation of Hepatitis C Virus RNA Abundance by a Liver-Specific MicroRNA. Science 2005, 309, 1577–1581.
23. Madden, S.F.; Carpenter, S.B.; Jeffery, I.B.; Björkbacka, H.; Fitzgerald, K.A.; O’Neill, L.A.; Higgins, D.G. Detecting microRNA activity from gene expression data. BMC Bioinform. 2010, 11, 257.
24. Holland, B.; Wong, J.; Li, M.; Rasheed, S. Identification of Human MicroRNA-Like Sequences Embedded within the Protein-Encoding Genes of the Human Immunodeficiency Virus. PLoS ONE 2013, 8, e58586.
25. Grossman, R.L.; Heath, A.P.; Ferretti, V.; Varmus, H.E.; Lowy, D.R.; Kibbe, W.A.; Staudt, L.M. Toward a Shared Vision for Cancer Genomic Data. N. Engl. J. Med. 2016, 375, 1109–1112.
26. Chuang, M.-K.; Chiu, Y.-C.; Chou, W.-C.; Hou, H.-A.; Chuang, E.Y.; Tien, H.-F. A 3-microRNA scoring system for prognostication in de novo acute myeloid leukemia patients. Leukemia 2015, 29, 1051–1059.
27. Kozomara, A.; Birgaoanu, M.; Griffiths-Jones, S. miRBase: From microRNA sequences to function. Nucleic Acids Res. 2019, 47, D155–D162.
28. RStudio Team. RStudio: Integrated Development Environment for R; RStudio, PBC: Boston, MA, USA, 2020.
29. Cilic, A. Motifs. 2022. Available online: https://github.com/aniscilic/motifs (accessed on 14 May 2022).
30. Huang, H.-Y.; Lin, Y.-C.-D.; Li, J.; Huang, K.-Y.; Shrestha, S.; Hong, H.-C.; Tang, Y.; Chen, Y.-G.; Jin, C.-N.; Yu, Y.; et al. miRTarBase 2020: Updates to the experimentally validated microRNA–target interaction database. Nucleic Acids Res. 2020, 48, D148–D154.
31. Gebert, L.F.R.; Macrae, I.J. Regulation of microRNA function in animals. Nat. Rev. Mol. Cell Biol. 2019, 20, 21–37.
32. Van Kouwenhove, M.; Kedde, M.; Agami, R. MicroRNA regulation by RNA-binding proteins and its implications for cancer. Nat. Rev. Cancer 2011, 11, 644–656.
33. Lukong, K.E.; Chang, K.-W.; Khandjian, E.W.; Richard, S. RNA-binding proteins in human genetic disease. Trends Genet. 2008, 24, 416–425.
34. Ciafrè, S.A.; Galardi, S. microRNAs and RNA-binding proteins. RNA Biol. 2013, 10, 934–942.
35. Zhang, Q.; Lu, M.; Cui, Q. SNP analysis reveals an evolutionary acceleration of the human-specific microRNAs. Nat. Preced. 2008.
36. Friedman, R.C.; Farh, K.K.-H.; Burge, C.B.; Bartel, D.P. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009, 19, 92–105.
37. Calin, G.A.; Sevignani, C.; Dumitru, C.D.; Hyslop, T.; Noch, E.; Yendamuri, S.; Shimizu, M.; Rattan, S.; Bullrich, F.; Negrini, M.; et al. Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proc. Natl. Acad. Sci. USA 2004, 101, 2999–3004.
38. Lu, M.; Zhang, Q.; Deng, M.; Miao, J.; Guo, Y.; Gao, W.; Cui, Q. An Analysis of Human MicroRNA and Disease Associations. PLoS ONE 2008, 3, e3420.
39. Calin, G.; Liu, C.-G.; Ferracin, M.; Hyslop, T.; Spizzo, R.; Sevignani, C.; Fabbri, M.; Cimmino, A.; Lee, E.J.; Wojcik, S.E.; et al. Ultraconserved Regions Encoding ncRNAs Are Altered in Human Leukemias and Carcinomas. Cancer Cell 2007, 12, 215–229.
40. Wallace, J.A.; O’Connell, R.M. MicroRNAs and acute myeloid leukemia: Therapeutic implications and emerging concepts. Blood 2017, 130, 1290–1301.
41. Vazquez, F.; Blevins, T.; Ailhas, J.; Boller, T.; Meins, F., Jr. Evolution of Arabidopsis MIR genes generates novel microRNA classes. Nucleic Acids Res. 2008, 36, 6429–6438.
42. Shi, B.; Gao, W.; Wang, J. Sequence Fingerprints of MicroRNA Conservation. PLoS ONE 2012, 7, e48256.
43. Lewis, B.P.; Burge, C.B.; Bartel, D.P. Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets. Cell 2005, 120, 15–20.
Figure 1. Flowchart of data collection and processing. In brackets is the database/tool used to perform the analysis.
Figure 2. The distribution of miRNA length in myeloid cancer.
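The length statistics summarized in Figure 2 can be illustrated with a minimal sketch. It uses only the seven sequences listed in Table 1, not the full 587-miRNA dataset, so the numbers differ from those reported in the abstract. The variable names and the choice of population variance are assumptions; the source does not state which variance estimator was used.

```python
import statistics

# Sequences copied from Table 1 (an excerpt, not the full dataset).
table1 = {
    "hsa-mir-1248": "ACCUUCUUGUAUAAGCACUGUGCUAAA",
    "hsa-mir-1183": "CACUGUAGGUGAUGGUGAGAGUGGGCA",
    "hsa-mir-1272": "GAUGAUGAUGGCAGCAAAUUCUGAAA",
    "hsa-mir-1244": "AAGUAGUUGGUUUGUAUGAGAUGGUU",
    "hsa-mir-921": "CUAGUGAGGGACAGAACCAGGAUUC",
    "hsa-mir-638": "AGGGAUCGCGGGCGGGUGGCGGCCU",
    "hsa-mir-1279": "UCAUAUUGCUUCUUUCU",
}

# Length of each mature miRNA sequence, in nucleotides.
lengths = [len(seq) for seq in table1.values()]

summary = {
    "mean": round(statistics.mean(lengths), 2),
    "median": statistics.median(lengths),
    # Population variance is an assumption made for this sketch.
    "variance": round(statistics.pvariance(lengths), 2),
}
print(summary)
```

Swapping `statistics.pvariance` for `statistics.variance` gives the sample variance instead, which matters little for a dataset of several hundred sequences.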
Figure 3. Percentage of nucleotides in each position of miRNAs studied. Position 1–27 corresponds to direction 5′-3′.
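A per-position nucleotide profile like the one in Figure 3 could be computed along the following lines. This is a sketch under one stated assumption: at each position, percentages are taken over the sequences that are long enough to reach that position. The source does not say how shorter sequences were handled, and the function name is hypothetical.

```python
from collections import Counter

def positional_percentages(seqs):
    """Per position (5'->3'), percentage of each nucleotide among the
    sequences that are long enough to have that position."""
    max_len = max(len(s) for s in seqs)
    table = []
    for pos in range(max_len):
        column = Counter(s[pos] for s in seqs if len(s) > pos)
        total = sum(column.values())
        table.append({nt: round(100 * column[nt] / total, 1) for nt in "AUGC"})
    return table

# Toy input chosen for illustration only.
profile = positional_percentages(["ACG", "AU", "U"])
for position, row in enumerate(profile, start=1):
    print(position, row)
```

Dividing by the total number of sequences instead would make the percentages at positions 23 to 27 very small, since few miRNAs are that long.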
Table 1. Nucleotide sequences of miRNAs implicated in myeloid cancer ¹.

| miRNA | Sequence (5′→3′, positions 1–27) | Length |
|---|---|---|
| hsa-mir-1248 | ACCUUCUUGUAUAAGCACUGUGCUAAA | 27 |
| hsa-mir-1183 | CACUGUAGGUGAUGGUGAGAGUGGGCA | 27 |
| hsa-mir-1272 | GAUGAUGAUGGCAGCAAAUUCUGAAA | 26 |
| hsa-mir-1244 | AAGUAGUUGGUUUGUAUGAGAUGGUU | 26 |
| hsa-mir-921 | CUAGUGAGGGACAGAACCAGGAUUC | 25 |
| hsa-mir-638 | AGGGAUCGCGGGCGGGUGGCGGCCU | 25 |
| hsa-mir-1279 | UCAUAUUGCUUCUUUCU | 17 |

¹ Position 1–27 corresponds to direction 5′-3′. The complete data can be found in Supplementary Table S1.
Table 2. Comparison of short and long miRNA publication numbers. Retrieved in February 2022 from reference counts of each miRNA page in mirbase.org.

| Group | miRNA | Nucleotide length | Publications |
|---|---|---|---|
| Short miRNAs | hsa-mir-1275 | 17 | 33 |
| Short miRNAs | hsa-mir-302e | 17 | 160 |
| Short miRNAs | hsa-mir-1207 | 18 | 33 |
| Long miRNAs | hsa-mir-1248 | 27 | 16 |
| Long miRNAs | hsa-mir-1183 | 27 | 5 |
| Long miRNAs | hsa-mir-1272 | 26 | 5 |
| Long miRNAs | hsa-mir-1244 | 26 | 7 |
Table 3. Purine rich and poor miRNAs.
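
| Group | miRNA | Sequence | A | G | C | U | % A + G |
|---|---|---|---|---|---|---|---|
| Highest purine content | hsa-mir-765 | UGGAGGAGAAGGAAGGUGAUG | 7 | 11 | 0 | 3 | 85.71 |
| Highest purine content | hsa-mir-1468 | AGCAAAAUAAGCAAAUGGAAAA | 14 | 4 | 2 | 2 | 81.82 |
| Highest purine content | hsa-mir-1910 | GAGGCAGAAGCAGGAUGACA | 8 | 8 | 3 | 1 | 80.00 |
| Highest purine content | hsa-mir-202 | AGAGGUAUAGGGCAUGGGAA | 7 | 9 | 1 | 3 | 80.00 |
| Highest purine content | hsa-mir-1255a | AGGAUGAGCAAAGAAAGUAGAUU | 11 | 7 | 1 | 4 | 78.26 |
| Highest purine content | hsa-mir-320a | AAAAGCUGGGUUGAGAGGGCGA | 7 | 10 | 2 | 3 | 77.27 |
| Highest purine content | hsa-mir-936 | ACAGUAGAGGGAGGAAUCGCAG | 8 | 9 | 3 | 2 | 77.27 |
| Highest purine content | hsa-mir-149 | AGGGAGGGACGGGGGCUGUGC | 3 | 13 | 3 | 2 | 76.19 |
| Lowest purine content | hsa-mir-1281 | UCGCCUCCUCCUCUCCC | 0 | 1 | 11 | 5 | 5.88 |
| Lowest purine content | hsa-mir-483 | UCACUCCUCUCCUCCCGUCUU | 1 | 1 | 11 | 8 | 9.52 |
| Lowest purine content | hsa-mir-877 | UCCUCUUCUCCCUCCUCCCAG | 1 | 1 | 12 | 7 | 9.52 |
| Lowest purine content | hsa-mir-1236 | CCUCUUCCCCUUGUCUCUCCAG | 1 | 2 | 11 | 8 | 13.64 |
| Lowest purine content | hsa-mir-1249 | ACGCCCUUCCCCCCCUUCUUCA | 2 | 1 | 13 | 6 | 13.64 |
| Lowest purine content | hsa-mir-1224 | CCCCACCUCCUCUCUCCUCAG | 2 | 1 | 13 | 5 | 14.29 |
| Lowest purine content | hsa-mir-1238 | CUUCCUCGUCUGUCUGCCCC | 0 | 3 | 10 | 7 | 15.00 |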
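The nucleotide counts and the % A + G column of Table 3 follow directly from each sequence. A minimal sketch of that arithmetic is given below; the function name and the output keys are hypothetical, and the two sequences are taken from Table 3.

```python
from collections import Counter

def purine_content(seq):
    """Count A, G, C, U in an RNA sequence and return the purine (A + G) percentage."""
    counts = Counter(seq.upper())
    purines = counts["A"] + counts["G"]
    return {
        "A": counts["A"],
        "G": counts["G"],
        "C": counts["C"],
        "U": counts["U"],
        "pct_AG": round(100 * purines / len(seq), 2),
    }

richest = purine_content("UGGAGGAGAAGGAAGGUGAUG")  # hsa-mir-765
poorest = purine_content("UCGCCUCCUCCUCUCCC")      # hsa-mir-1281
print(richest)  # {'A': 7, 'G': 11, 'C': 0, 'U': 3, 'pct_AG': 85.71}
print(poorest)  # {'A': 0, 'G': 1, 'C': 11, 'U': 5, 'pct_AG': 5.88}
```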
Table 4. The list of identified 3n motifs in studied myeloid cancer miRNA dataset.

| 3n Motif | Frequency | Percentage |
|---|---|---|
| CUG | 249 | 42.42% |
| UGC | 234 | 39.86% |
| UGG | 233 | 39.69% |
| UGU | 231 | 39.35% |
| CAG | 213 | 36.29% |
| UUG | 213 | 36.29% |
| CCU | 205 | 34.92% |
| CUU | 205 | 34.92% |
| GUG | 202 | 34.41% |
| AGG | 196 | 33.39% |
| UCU | 195 | 33.22% |
| GCU | 191 | 32.54% |
| CGU | 74 | 12.61% |
| CGC | 72 | 12.27% |
| GCG | 66 | 11.24% |
| UCG | 62 | 10.56% |
| ACG | 56 | 9.54% |
| CGA | 42 | 7.16% |
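The percentages in Table 4 match frequency divided by the 587 miRNAs in the dataset (for example, 249/587 = 42.42%), which suggests that each miRNA is counted once per motif it contains. The sketch below illustrates that kind of count. It is not the authors' script, the function name is hypothetical, and the three input sequences are just an excerpt from Table 1.

```python
from collections import Counter

def motif_presence(seqs, k):
    """For every k-nucleotide motif, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        # A set ensures a motif repeated within one sequence is counted once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(kmers)
    return counts

seqs = [
    "UCAUAUUGCUUCUUUCU",          # hsa-mir-1279
    "CUAGUGAGGGACAGAACCAGGAUUC",  # hsa-mir-921
    "AGGGAUCGCGGGCGGGUGGCGGCCU",  # hsa-mir-638
]
counts = motif_presence(seqs, 3)
for motif in ("AGG", "UUC", "CUU"):
    n = counts[motif]
    print(motif, n, f"{100 * n / len(seqs):.2f}%")
```

Running the same function with `k` set to 4 through 7 would give the longer motif classes; counting every occurrence instead of presence would only require replacing the set with a list.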
Table 5. The number of most and least detected 4n motifs.
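
| Most detected 4n motif (present in >70 miRNAs) | Frequency | Percentage | Least detected 4n motif (present in <10 miRNAs) | Frequency | Percentage |
|---|---|---|---|---|---|
| CUGC | 87 | 14.82% | CGAA | 10 | 1.70% |
| ACUG | 85 | 14.48% | CGAG | 10 | 1.70% |
| UGCA | 85 | 14.48% | CGUA | 10 | 1.70% |
| CUUU | 83 | 14.14% | UCGA | 10 | 1.70% |
| AGUG | 82 | 13.97% | ACGA | 9 | 1.53% |
| CUGG | 80 | 13.63% | ACGC | 9 | 1.53% |
| CUGU | 80 | 13.63% | UACG | 8 | 1.36% |
| UUUG | 79 | 13.46% | CGAU | 7 | 1.19% |
| CAGU | 78 | 13.29% | UUCG | 6 | 1.02% |
| UUCU | 78 | 13.29% | | | |
| UGCU | 77 | 13.12% | | | |
| UGUG | 77 | 13.12% | | | |
| GUGC | 76 | 12.95% | | | |
| UGGG | 76 | 12.95% | | | |
| UCUG | 75 | 12.78% | | | |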
Table 6. The highly observed long motifs.
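
| 7n Motif | Frequency | 6n Motif | Frequency | 5n Motif | Frequency | 4n Motif | Frequency |
|---|---|---|---|---|---|---|---|
| UUUAGAG | 19 | AAGUGC | 22 | AGUGC | 36 | CUGC | 87 |
| AAGUGCU | 18 | GCUUCC | 22 | CUUCC | 34 | ACUG | 85 |
| AGUGCUU | 16 | UUUAGA | 21 | GCUUC | 33 | UGCA | 85 |
| GUGCUUC | 15 | UUAGAG | 20 | AAGUG | 32 | CUUU | 83 |
| UGCUUCC | 15 | UGCUUC | 19 | CCUUU | 32 | AGUG | 82 |
| | | AGUGCU | 18 | CUGCC | 31 | CUGG | 80 |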
Table 7. Motifs and consensus sequences in miRNAs.
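
miRNAs aligned 5′-3′: 519b, 520c, 526b (the sequence alignment is provided as an image in the original article).

| 3n | 4n | 5n | 6n | 7n |
|---|---|---|---|---|
| GUG | GAGG | AGUGC | AAAGUG | UUUAGAG |
| UGC | CUUU | CUUUU | AAGUGC | AAAGUGC |
| CCU | GUGC | AAGUG | UUUAGA | UCCUUUU |
| CUU | UAGA | AGAGC | UUAGAG | UUAGAGG |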
Table 8. MiRNA with 7n motifs and their gene targets.
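
| 7n Motif | Frequency | Targeted Gene | Frequency | Percentage |
|---|---|---|---|---|
| GUGCUUC | 15 | EIF2S1 | 15 | 100 |
| GUGCUUC | 15 | SPRED1 | 15 | 100 |
| GUGCUUC | 15 | HIP1 | 15 | 100 |
| GUGCUUC | 15 | YOD1 | 15 | 100 |
| GUGCUUC | 15 | ELK4 | 15 | 100 |
| GUGCUUC | 15 | ABHD15 | 15 | 100 |
| UGCUUCC | 15 | ACOX1 | 14 | 93.33 |
| UGCUUCC | 15 | EIF2S1 | 13 | 86.67 |
| UGCUUCC | 15 | HOOK3 | 13 | 86.67 |
| UGCUUCC | 15 | SPRED1 | 13 | 86.67 |
| UGCUUCC | 15 | HIP1 | 13 | 86.67 |
| UGCUUCC | 15 | YOD1 | 13 | 86.67 |
| AGUGCUU | 16 | PNRC1 | 15 | 93.75 |
| AGUGCUU | 16 | DNAJC10 | 15 | 93.75 |
| AGUGCUU | 16 | HMGB1 | 15 | 93.75 |
| AGUGCUU | 16 | MED18 | 15 | 93.75 |
| AGUGCUU | 16 | MASTL | 15 | 93.75 |
| AGUGCUU | 16 | DSTYK | 15 | 93.75 |
| AAGUGCU | 18 | HMGB1 | 17 | 94.44 |
| AAGUGCU | 18 | PNRC1 | 16 | 88.89 |
| AAGUGCU | 18 | DNAJC10 | 16 | 88.89 |
| AAGUGCU | 18 | MED18 | 16 | 88.89 |
| AAGUGCU | 18 | MASTL | 16 | 88.89 |
| AAGUGCU | 18 | DSTYK | 16 | 88.89 |
| UUUAGAG | 19 | YOD1 | 8 | 42.11 |
| UUUAGAG | 20 | ABHD15 | 8 | 40.00 |
| UUUAGAG | 21 | DNAJC28 | 8 | 38.10 |
| UUUAGAG | 22 | SAMD8 | 8 | 36.36 |
| UUUAGAG | 23 | PRRG4 | 8 | 34.78 |
| UUUAGAG | 24 | GPR157 | 8 | 33.33 |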
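The percentage column in Table 8 is the number of motif-carrying miRNAs that target a gene divided by the number of miRNAs carrying the motif (for example, 14/15 = 93.33%). A minimal sketch of that tally follows. The function name, the toy miRNA names, and the gene labels are all hypothetical placeholders and do not come from the dataset or from miRTarBase.

```python
from collections import Counter

def shared_targets(motif, mirna_seqs, mirna_targets):
    """For miRNAs containing `motif`, report (gene, carrier count, percentage of carriers)."""
    carriers = [name for name, seq in mirna_seqs.items() if motif in seq]
    gene_counts = Counter()
    for name in carriers:
        # set() guards against a gene being listed twice for one miRNA.
        gene_counts.update(set(mirna_targets.get(name, ())))
    return [
        (gene, n, round(100 * n / len(carriers), 2))
        for gene, n in gene_counts.most_common()
    ]

# Hypothetical toy data, for illustration only.
mirna_seqs = {
    "toy-1": "AAGUGCUUCC",
    "toy-2": "CAGUGCUUCA",
    "toy-3": "UGUGCUUCGA",
    "toy-4": "ACUGGACUUU",  # does not contain the motif
}
mirna_targets = {
    "toy-1": ["GENE_A", "GENE_B"],
    "toy-2": ["GENE_A"],
    "toy-3": ["GENE_A", "GENE_B"],
    "toy-4": ["GENE_B"],
}
result = shared_targets("GUGCUUC", mirna_seqs, mirna_targets)
print(result)  # [('GENE_A', 3, 100.0), ('GENE_B', 2, 66.67)]
```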
Table 9. 5n and 6n motifs conserved between humans and other species.

5n motifs (in column order): CAGUG, CUGGG, UGCAG, UUUUC, UUGCA, GAGAG
6n motifs (in column order): AGUGCU, AGUGCA, CUGCAG
Values are listed per species in left-to-right column order; "[ ]" marks one or more empty cells.
Pongo pygmaeus: 5.4%, 4.8%, 4.8%, [ ], 3.1%, 2.9%
Mus musculus: 5.2%, 5.7%, 5.0%, [ ], 2.1%
Equus caballus: 6.2%, 6.4%, 5.8%, [ ], 2.0%, 2.3%, 2.3%
Oreochromis niloticus: 6.6%, [ ], 6.9%, [ ], 2.9%, 3.2%
Monodelphis domestica: 4.5%, [ ], 2.0%, 1.8%
Gallus gallus: 5.8%, [ ], 7.2%, [ ], 2.4%, [ ], 3.0%
Human: [ ], 6.0%, 5.2%, [ ], 2.2%
Caenorhabditis elegans: [ ], 7.3%
Drosophila melanogaster: [ ], 8.3%
Picea abies: [ ], 10.7%
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Dogan, S.; Spahiu, E.; Cilic, A. Structural Analysis of microRNAs in Myeloid Cancer Reveals Consensus Motifs. Genes 2022, 13, 1152. https://doi.org/10.3390/genes13071152

