2.1. DMS-Seq Data Suggest Unfolding of MALAT1 Structure in K562 Cells
DMS-Seq involves chemically labeling RNA with DMS on unstructured adenosine and cytidine residues, which stops reverse transcriptase in a manner that can be detected by sequencing [27]. K562 DMS-Seq data [27] for MALAT1 (herein referred to as K562-MALAT1) were first analyzed to determine which adenosine and cytidine nucleotides are unstructured or structured (Supplemental Table S1). Of the 8425 nucleotides within the human MALAT1 transcript that were examined, 3951 nucleotides are either adenosine or cytidine. Of these, DMS-Seq data determined using MALAT1 isolated from K562 cells were available for 2554 adenosine and cytidine nucleotides, whereby 1835 DMS-Seq datapoints (71.8%; a datapoint being the number of DMS-Seq reads corresponding to a single adenosine or cytidine) were classified as unstructured and 719 DMS-Seq datapoints (28.2%) were classified as structured pursuant to the 250-read threshold (see Section 3 and Figure 1A). When K562-MALAT1 was compared to the working noncancerous MALAT1 model (Supplemental Table S1), 1504 datapoints (58.9%) agreed with the consensus model while 733 datapoints (28.7%) corresponded to loss of structure and 317 datapoints (12.4%) corresponded to gain of structure (Figure 1B). While the majority of K562-MALAT1 datapoints agreed with the MALAT1 consensus model, 41.1% of datapoints diverged from the noncancerous model, suggesting that widespread changes in MALAT1 secondary structure may occur in K562 cells.
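The comparison between K562-MALAT1 and the consensus model amounts to a per-nucleotide tally, which can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names and the boolean encoding of each nucleotide (paired in the model, structured in K562) are assumptions made for the example, and only the three counts are taken from the text above.

```python
def compare_to_model(paired_in_model: bool, structured_in_k562: bool) -> str:
    """Label one A/C datapoint relative to the consensus model (hypothetical helper)."""
    if paired_in_model == structured_in_k562:
        return "agree"
    # Paired in the model but unstructured in K562 counts as a loss of structure;
    # unpaired in the model but structured in K562 counts as a gain.
    return "loss" if paired_in_model else "gain"


def percent_shares(counts: dict) -> dict:
    """Convert category counts into percentages rounded to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts reported in the text for the 2554 A/C datapoints with K562 data.
reported = {"agree": 1504, "loss": 733, "gain": 317}
shares = percent_shares(reported)

# Share of datapoints that diverge from the model in either direction.
diverged = round(
    100 * (reported["loss"] + reported["gain"]) / sum(reported.values()), 1
)
```

Run on the reported counts, the tally reproduces the percentages quoted above, including the 41.1% of datapoints that diverge from the noncancerous model.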
K562-MALAT1 data that corresponded to the loss or gain of structure were mapped onto the noncancerous MALAT1 model to examine how the secondary structure of MALAT1 may vary in K562 cells (Figure 2 and Supplemental Figure S1). Of the 194 hairpins predicted in the noncancerous MALAT1 model, 101 hairpins from the noncancerous model are supported by DMS-Seq data and 59 of these hairpins (58.4%) lose at least half of their base pairs in K562 cells. Among these hairpins are: H44, losing 13 of 25 base pairs (52.0%); H45, losing 5 of 8 base pairs (62.5%); H98, losing 8 of 12 base pairs (66.7%); and H155, losing 4 of 6 base pairs (66.7%) (Figure 3, “K562-MALAT1”). These correspond to novel losses of eight base pairs in H44, four base pairs in H45, six base pairs in H98, and two base pairs in H155 (Figure 3, “Change”). The comparative analysis suggests widespread loss of structural features within the context of K562-MALAT1. While individual nucleotides occasionally appear to gain structure in K562-MALAT1, the sporadic occurrences do not suggest the development of any unambiguous novel secondary structures in MALAT1 in K562 cells (Figure 2). It is worth noting that refolding of the MALAT1 structure based on the K562 DMS-Seq data is expected to produce a novel secondary structure of K562-MALAT1. However, DMS-Seq is the only major RNA structural probing dataset available for K562 cells and it lacks data for about 1600 nts from a central region of MALAT1; therefore, a novel model cannot be constructed and the analysis herein is restricted to identifying regions of MALAT1 that potentially change in K562 cells. Overall, K562-MALAT1 results suggest many hairpins in the working MALAT1 model lose structure, and unstructured regions remain unstructured. This result is in agreement with prior work using DMS-Seq data, which found loss of structure in mRNAs [27]. Cumulatively, the K562-MALAT1 data indicate general loss of structure in MALAT1, thereby suggesting possible functional ramifications within the context of K562 cells.
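The criterion of losing at least half of a hairpin's base pairs can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the base-pair counts are the four examples quoted in the text.

```python
from fractions import Fraction


def fraction_lost(lost_bp: int, total_bp: int) -> Fraction:
    """Exact fraction of a hairpin's base pairs that are lost."""
    return Fraction(lost_bp, total_bp)


def loses_at_least_half(lost_bp: int, total_bp: int) -> bool:
    """Apply the 'at least half of the base pairs' criterion without float error."""
    return fraction_lost(lost_bp, total_bp) >= Fraction(1, 2)


# (lost base pairs, total base pairs) for the hairpins named in the text.
hairpins = {"H44": (13, 25), "H45": (5, 8), "H98": (8, 12), "H155": (4, 6)}

percent_lost = {
    name: round(float(100 * fraction_lost(lost, total)), 1)
    for name, (lost, total) in hairpins.items()
}
```

Using exact fractions for the comparison avoids borderline rounding issues for hairpins that lose exactly half of their base pairs.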
2.2. Predicted Secondary Structural Changes in K562-MALAT1 Would Impact Multiple RNA- and Protein-Binding Sites
Loss of secondary structure in K562-MALAT1 may signal that certain RNA- and protein-binding sites are now available in MALAT1, especially for single-stranded RNA-binding proteins. As such, this possibility was examined to identify the aberrant binding events in K562-MALAT1 that are different from binding and interaction events for MALAT1 in noncancerous conditions. miRNAs, ncRNAs, proteins, RNA modifications, SNPs, and cancer-associated mutations were re-aligned to MALAT1 to identify structure-function relationships that provide a starting point to examine their possible roles in CML (Supplemental Table S1).
miRNAs are currently known to play a pivotal role in the development and progression of CML [31]. Of the 98 validated miRNA-binding sites in MALAT1, 28 sites occur in hairpins predicted to lose structure in K562-MALAT1, thereby increasing the accessibility of these binding sites and their potential for sponging (Supplemental Table S1). Examples include miR-320, which overlaps with H101; miR-217, which overlaps with H160; and miR-140-5p, which overlaps with H168 (Figure 4). miR-320 is considered a tumor suppressor in K562 cells, but K562 cells often bypass its action by transporting miR-320 to exosomes via hnRNPA1 [32]. Sponging of miR-320 by MALAT1 in K562 cells could also dampen the tumor-suppressive effects of miR-320, as is the case with lncRNA SNHG12 sponging miR-320 in gastric cancer [33]. miR-217 reportedly targets the mRNA of oncogenic protein AGR2 in K562 cells [19]. As decreases in unbound miR-217 accompany AGR2 upregulation and subsequent dasatinib resistance in K562 cells [19], sponging of miR-217 by MALAT1 may have similar effects in K562 cells. miR-140-5p has been linked to CML cell apoptosis via targeting of the SIX1 mRNA transcript [34]; therefore, possible sponging of miR-140-5p by MALAT1 in K562 cells may promote cell survival. These examples highlight how the novel availability of miRNA-binding sites in MALAT1 may aid in K562 cell progression via multiple miRNA-mediated mechanisms.
Eleven hairpins expected to lose structure in K562-MALAT1 (H49, H77, H79, H80, H147, H148, H155, H156, H164, H165, H168) overlap with eight of the ten U1 snRNA-binding sites (nts 1825–1925, 3015–3067, 3152–3185, 5924–6023, 6127–6277, 6850–6884, 6985–7045, and 7138–7206) (Supplemental Table S1). U1 snRNA is known to be mutated in multiple cancer types to promote aberrant gene splicing patterns [35]. Although the roles of U1-MALAT1 interactions have not been elucidated, it is conceivable that binding of U1 snRNA to MALAT1 may also contribute toward alternate, oncogenic splicing patterns that promote CML. Furthermore, one U1 snRNA-binding site (nts 3152–3185) overlaps with a HuR/ELAV1-binding site (nts 3158–3163) that may become accessible upon loss of H79 in K562-MALAT1. Although the effects of HuR-MALAT1 binding on CML have not been investigated, the HuR-MALAT1 RNP complex has been shown to stop breast cancer cells from undergoing epithelial-mesenchymal transitioning by decreasing the levels of CD133 [36]. Thus, increased HuR-MALAT1 binding in K562 cells is expected to hinder cancer progression, suggesting the existence of alternate pathways by which HuR-MALAT1 binding affects K562 cells. HuR typically binds to mRNAs in cancer in order to promote cancerous functions, such as metastasis and apoptosis resistance [37]. Thus, competition between HuR and U1 snRNA for a binding site around nt 3160 may point to a carefully controlled cancer-promoting mechanism mediated by MALAT1. In general, characterized protein-binding sites on MALAT1 are expected to become more available as a result of widespread structural loss and these changes in protein-MALAT1 binding predicted by K562 DMS-Seq data hint at novel pathways to explore further.
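Identifying which binding sites share nucleotides is an interval-overlap test on inclusive coordinates. The following sketch uses only the U1 snRNA and HuR coordinates quoted above; hairpin coordinates are not given in this section, so they are omitted, and the function names are hypothetical.

```python
# Inclusive nucleotide ranges for the eight U1 snRNA-binding sites listed in the text.
U1_SITES = [
    (1825, 1925), (3015, 3067), (3152, 3185), (5924, 6023),
    (6127, 6277), (6850, 6884), (6985, 7045), (7138, 7206),
]
# HuR-binding site quoted in the text.
HUR_SITE = (3158, 3163)


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two inclusive (start, end) ranges share at least one nucleotide."""
    return a[0] <= b[1] and b[0] <= a[1]


def sites_overlapping(query: tuple, sites: list) -> list:
    """Return every site in `sites` that overlaps the query range."""
    return [site for site in sites if overlaps(query, site)]


shared = sites_overlapping(HUR_SITE, U1_SITES)
```

With these coordinates, the only U1 snRNA-binding site that overlaps the HuR site is nts 3152–3185, which matches the competition around nt 3160 described above.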
Besides RNA- and protein-binding sites, RNA structure can also be modulated by RNA modifications, SNPs, and cancer-associated mutations. RNA modifications on MALAT1 in K562 cells are undetermined, so modification sites from other cell lines were used. Of all the 82 m6A modifications mapped to MALAT1 at single-nucleotide resolution in HEK293, HEK293T, and HeLa cells [24,38,39,40,41], 61 m6A modifications either overlap with hairpins predicted to lose structure or, if not overlapping with hairpins, correspond to adenosines predicted to be unstructured (Supplemental Table S1). The METTL3/14 complex, which is responsible for about 80% of m6A marks in human mRNAs and ncRNAs, has no strong preference for ssRNA or dsRNA [42], suggesting secondary structural changes are insufficient to predict changes in m6A levels caused by METTL3/14. It is worth noting that the METTL3/14 complex is considered tumor suppressive and is downregulated in cancers like endometrial cancer, whereas m6A erasers like ALKBH5 and FTO are oncogenic and often upregulated in cancers like acute myeloid leukemia (AML) and breast cancer [43]. Correspondingly, ALKBH5 is associated with MALAT1 upregulation [44] and FTO regulates MALAT1 levels via demethylation [45]. ALKBH5 does not discriminate between ssRNA and dsRNA [46] and FTO targets ssRNA [47]; therefore, m6A marks in MALAT1 are potential substrates for both m6A erasers. Also notable, m6A2515 enables the binding of hnRNPG to MALAT1 at H63 [16], which is lost in K562-MALAT1 (Figure 2). As hnRNPG has stronger binding affinity for ssRNA, particularly in A-rich regions [48,49], and has generally been associated with tumor suppressive effects [50,51,52], increased MALAT1-hnRNPG binding may decrease the tumor suppressive activity of hnRNPG and may promote K562 cell progression. While the functional effects of any aberrant methylation patterns are difficult to predict in CML, m6A modifications and their roles in RNA regulation and function, particularly with regard to mRNAs where such modifications are the most common, have been explored in attempts to develop novel cancer biomarkers and treatments [53]. Therefore, understanding how structural alterations in MALAT1 modulate m6A modification sites in K562 cells is of particular interest.
Seventeen SNPs have been identified in MALAT1 [54]. rs664589 (C4117G), rs115795653 (A6415G), and rs60151940 (C7151W) are three SNPs that correspond to nucleotides that are predicted to lose structure in K562-MALAT1 (Supplemental Table S1). SNPs in structured RNAs are generally believed to alter the local secondary structure [55], although the severity of alterations can vary and is difficult to predict [56]. Interestingly, SNP rs664589 has been characterized as aiding colorectal cancer progression by inhibiting MALAT1-miR-194-5p binding [57]. Besides SNPs, 655 somatic cancer-associated mutations have been identified in MALAT1 [58]. Fifty-nine mutations (9.0%) correspond to nucleotides predicted to lose structure in K562-MALAT1 (Supplemental Table S1). Such mutations could further weaken the hairpin structures or reduce miRNA binding, particularly if the mutation were to disrupt base pairing in the seed region of the miRNA-binding site. Within H71, the A2875U mutation alters seed-region base pairing for miR-92a-3p, miR-363-3p, and miR-25-3p (Figure 5 and Supplemental Table S1). Although miR-363-3p is associated with tumor suppression in other cancer types [59], miR-92a-3p and miR-25-3p promote progression in cancers like liposarcomas [60]. Moreover, miR-92a-3p was previously found to aid CML by downregulating C/EBPα and subsequently causing cachexia, i.e., severe weight and muscle loss associated with cancer [61]. As predicted previously, this proposed role of H71 in regulating MALAT1-miRNA interactions illustrates how H71 can modulate different outputs depending on the cellular context [15]. In total, 16 miRNA-binding sites have seed regions within hairpins predicted to become unstructured in K562-MALAT1 (Supplemental Table S1, “SeqMarkup” tab) and may experience reduced binding affinity due to somatic cancer-associated mutations. Additionally, one METTL3/14-binding site, two HuR/ELAV1-binding sites, and ten U1-binding sites face similar conditions because of mutations, pointing to complex regulatory pathways that may depend on the K562-MALAT1 structure. Together, the effects of secondary structural loss on miRNA-, U1 snRNA-, HuR-, m6A-, SNP-, and cancer-associated mutation-related processes in K562 cells represent potential avenues for further characterization of MALAT1 activity in cancer.
2.3. PARIS Data Suggest Maintenance of Overall Structure of MALAT1 with Rearrangements of Select Long-Range Interactions
PARIS involves sequencing of RNA fragments that were once photocrosslinked duplexes isolated from psoralen-treated cells [28]. PARIS data for MALAT1 in HeLa cells (herein referred to as HeLa-MALAT1) were compared and mapped to the noncancerous consensus model [15] to see how the MALAT1 secondary structural model is expected to change within the context of HeLa cells (Figure 6, Supplemental Figure S2 and Supplemental Tables S1 and S2) [28,29]. Eighty unique PARIS interactions (Supplemental Table S2) were aligned to the MALAT1 model. Of these, 18 PARIS interactions (22.5%) diverged from hairpins described in the consensus model while 62 local interactions (77.5%) agreed with the hairpins in the consensus model, suggesting that the secondary structure of MALAT1, as it is hypothesized to exist, may be largely maintained in HeLa cells (Figure 6). The 18 PARIS interactions that diverge from the model typically suggest that secondary structural elements undergo structural rearrangement. Of the 18 divergent PARIS interactions, six interactions (33.3%) were denoted as short-range or local interactions and 12 interactions (66.7%) were denoted as long-range interactions, most of which are separated by at least 80 nucleotides in their primary structure in the working noncancerous MALAT1 model (Supplemental Table S2) [15]. The 12 divergent long-range PARIS interactions typically signal the structural rearrangement of multiple hairpins whereas the six short-range PARIS interactions signal the formation of novel structures. Up to 42 hairpins out of 161 hairpins (26.1%) are expected to undergo rearrangement and 119 hairpins (73.9%), which are conserved among mammals [15], appear maintained in the HeLa-MALAT1 model (Supplemental Tables S1 and S2). Thus, with regard to structural alterations of hairpins, rearrangement of long-range interactions appears to be favored over formation of novel short-range interactions.
Five of the aforementioned local PARIS interactions occur in predominantly unstructured regions of the working MALAT1 consensus model (Figure 6, dark red lines). Four of these interactions fall between nts 1897 and 1941 (i.e., between H49 and H50) and the fifth interaction falls between nts 7458 and 7461, preceding H174 (Figure 6). The noncancerous HEK293T PARIS data did not highlight any such structures at the corresponding locations (Supplemental Table S2). Together, these five interactions suggest distinct instances of dynamic, novel structures. In contrast, most of the 12 long-range PARIS interactions indicate distinct instances of structural rearrangement (Figure 6, purple lines). Curiously, five of the long-range PARIS interactions and one divergent short-range PARIS interaction start within 561 nucleotides of one another, spanning nts ~4950 to ~5600 (Figure 6 and Supplemental Table S2). This region is largely conserved among mammals as well as some vertebrates [15]. Thus, a core of MALAT1 undergoes structural rearrangement in HeLa cells. The 12 long-range PARIS interactions suggest rearrangement of 39 hairpins, such as H126, H134, H136, and H178. Long-range interactions suggest rearrangement of H105 (coordinates 6446,6564), which notably forms a 56-way junction, and H170 (coordinates 7631,8196), which notably forms a 20-way junction (Figure 6, 56WJ and 20WJ). The PARIS data suggest these long-range interactions are lost in favor of structural rearrangement in HeLa cells, as opposed to general structural loss in K562-MALAT1.
In addition to hairpins, the hypothetical consensus model predicts 13 pseudoknots in noncancerous cells [15]. As previously noted, m6A5044 is absent in HeLa cells [15,24,25,26]. This loss of methylation may result in the loss of PK7 as there is a lack of PARIS data for PK7 (coordinates 5038, 6642) in HeLa cells, as previously reported [15]. Instead, PARIS reads (coordinates 5038,5145) suggest formation of a local hairpin [15], as indicated by the sixth divergent short-range PARIS interaction (Figure 6). Additionally, long-range PARIS interactions suggest structural rearrangement of PK3 and PK9 while the structural rearrangement of PK10 and PK11 is supported by short-range interactions. PARIS data do not predict disruption of any other pseudoknots. Unlike hairpins, pseudoknots typically span long ranges of MALAT1 in the working noncancerous model [15]. As a result, loss of many pseudoknots would indicate widespread structural changes in MALAT1. While the loss of these pseudoknots signals some propensity for long-range structural changes, the maintenance of eight pseudoknots reaffirms the trend of structural maintenance within HeLa-MALAT1. Overall, most local secondary structural features are retained in HeLa-MALAT1, with rearrangement of select long-range secondary structures and formation of a small number of novel, local structures.
2.4. Predicted Structural Changes in HeLa-MALAT1 Would Impact RNA-Binding Sites and Modifications
Structural rearrangements or novel structures detected in HeLa-MALAT1 mean that the structures of RNA- and protein-binding sites underwent changes that may potentially alter their function. MALAT1 has 98 experimentally verified miRNA-binding sites (Supplemental Table S1). Duplex formation in HeLa-MALAT1 suggests disruption of seed region-binding sites for 25 of these validated miRNAs (Supplemental Table S2), which means these binding sites would be less accessible in HeLa cells. The four local PARIS interactions spanning nts 1897–1941 potentially decrease binding site availability for miR-145-5p [62]. miR-145 has been shown to inhibit HeLa cell proliferation by targeting the FSCN1 mRNA transcript [63]. miR-145 is reportedly a tumor suppressor in HeLa cells via the regulation of several proteins, including CDKs and Cyclin D1 [64]. Although miR-145 is downregulated in HeLa cells [65], its function relative to expected changes in the HeLa-MALAT1 structure raises questions regarding the full role of miR-145 in HeLa cells. The remaining PARIS interactions indicate structural rearrangement of binding sites for 24 other supported miRNAs, including miR-200b-3p, miR-20a-5p, and miR-106b-5p [62]. miR-200b is upregulated in cervical cancer and aids cervical cancer metastasis by downregulating FOXG1 [66]. Likewise, miR-20a is upregulated in HeLa cells and leads to the upregulation of the oncogenic protein TNKS2 in HeLa cells [67]. miR-106b is also upregulated in HeLa cells [68] and downregulates PTEN, whereas sponging of miR-106b by the lncRNA PTENP1 inhibits HeLa cell proliferation [69]. The PARIS data suggest these three latter miRNAs will not be sponged by HeLa-MALAT1, thus possibly aiding HeLa cell growth and survival. Collectively, these studies suggest the presence of complex miRNA-lncRNA-mRNA networks that may be disrupted by changes to MALAT1 secondary structure in HeLa cells. Additional work is required to elucidate the full pathways governed by such miRNAs and to fully understand how structural changes in MALAT1 affect miRNA function in HeLa cells.
Besides miRNAs, MALAT1 has been described as forming intermolecular RNA–RNA interactions with rRNA and U1 snRNA [70,71]. The structural status of one of the five rRNA-binding sites (C2700) and two of the ten U1 snRNA-binding sites (nts 1825–1925 and 6985–7045) is changed in HeLa-MALAT1 (Supplemental Table S1). Because few sites are affected, little to no significant alteration to MALAT1-mediated U1 snRNA and rRNA function is expected in HeLa cells. Additionally, protein-binding sites on MALAT1 are expected to become less available as a result of structural rearrangement throughout HeLa-MALAT1. Three METTL3/14-binding sites (nts 2412–2416, 5042–5046 and 8179–8184) and one HuR/ELAV1-binding site (nts 3248–3258) are hypothesized to undergo structural rearrangement, as indicated by the PARIS data. As previously described, the lack of a strong preference for ssRNA or dsRNA makes analysis of novel METTL3/14 function with regard to MALAT1 difficult [42]. However, the METTL3/14 complex shows some increased affinity for single-stranded nucleic acids [72], so aberrant m6A levels are possible under such circumstances as a result of MALAT1 rearrangement. Unlike in K562 cells, loss of HuR-MALAT1 binding is expected in HeLa cells as structural rearrangement will make the HuR-binding site less available. This loss mirrors the aforementioned functions of HuR-MALAT1 binding in breast cancer [36], suggesting HuR-MALAT1 binding may be decreased in HeLa cells in order to target CD133 expression and subsequently promote cancer progression. Because only one HuR-binding site is expected to undergo structural rearrangement, the repercussions on HuR function may be muted. Although probing of this particular pathway is needed to confirm such a hypothesis, a possible role of HuR-MALAT1 binding is more compelling in HeLa cells than in K562 cells.
Interestingly, several RNA modification sites identified in MALAT1 isolated from HeLa cells occur in structurally rearranged regions: 19 m6A sites, five m5C sites (C4834, C5518, C5520, C5538, and C5539), and one Am (2′-O-methyladenosine) modification site (A1909) (Supplemental Tables S1 and S2). m5C modifications have been found to regulate chromatin-related roles in other lncRNAs, such as HOTAIR and Xist, for this modification often occurs specifically in regions of the lncRNA that interact with chromatin-associated protein complexes [73]. The five aforementioned m5C sites were specifically identified in HeLa cells (see Supplemental Table S1). All five m5C sites in MALAT1 are clustered within 705 nucleotides of each other, with four of them clustered within 21 nucleotides (Supplemental Table S1). Thus, because MALAT1 binds active chromatin, a novel structure in HeLa-MALAT1 may promote a distinct and cancer-specific chromatin-associated complex via m5C [74]. Moreover, the existence of modified nucleotides in MALAT1 and the diversity of RNA modifications, along with advances in modification detection, may result in the discovery of novel MALAT1 modifications that can be implemented as biomarkers [75]. Thus, integrating PARIS and RNA modification data yielded insights into how RNA modifications, particularly m5C, may influence MALAT1 function in HeLa cells.
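The clustering of the five m5C sites can be verified directly from their positions. The sketch below is a simple illustration with hypothetical helper names; the positions are those listed in the text.

```python
# m5C positions in MALAT1 reported for HeLa cells (C4834, C5518, C5520, C5538, C5539).
M5C_SITES = [4834, 5518, 5520, 5538, 5539]


def span(positions: list) -> int:
    """Distance in nucleotides between the first and last position."""
    return max(positions) - min(positions)


def tightest_window(positions: list, k: int) -> list:
    """Return the k consecutive sites (by coordinate) with the smallest span."""
    ordered = sorted(positions)
    windows = [ordered[i:i + k] for i in range(len(ordered) - k + 1)]
    return min(windows, key=span)


all_sites_span = span(M5C_SITES)
closest_four = tightest_window(M5C_SITES, 4)
closest_four_span = span(closest_four)
```

Here `all_sites_span` is 705 and `closest_four_span` is 21, with C4834 being the site that lies outside the tight cluster.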
MALAT1 has 17 SNPs [54]. The HeLa-MALAT1 data suggest structural rearrangements for three SNP sites: rs11540782 (U1876C), rs1056816 (A4872K), and rs79910129 (G3247W) (Supplemental Table S1). As previously stated, the exact effects of a given SNP on secondary structure can vary but often result in the disruption of duplexes and loss of secondary structure [56]. As such, based on the PARIS data, no major cellular changes are expected in HeLa cells related to the MALAT1 SNPs. Of the 655 somatic cancer-associated mutations that have been identified in MALAT1 [58], 102 mutations (15.6%) occur in regions predicted to undergo structural rearrangement in HeLa-MALAT1. Mutations within PARIS interactions are liable to destabilize the corresponding RNA duplexes but are also likely to disrupt the binding sites, thus decreasing binding of molecules like miRNAs. Eight miRNA seed-region binding sites in HeLa-specific PARIS interactions are altered by mutations, as are one METTL3/14-binding site and one HuR/ELAV1-binding site. A U5520 insertion alters the seed-region binding sites of three miRNAs within a long-range PARIS interaction (coordinates 5503, 5708): miR-17-5p, miR-20ab-5p, and miR-106b-5p. As discussed previously, free miR-20a is expected to aid HeLa cells via TNKS2 expression [67]. Both miR-17-5p and miR-106b-5p are described as oncogenic in cervical cancer [76,77]. miR-17-5p targets TGFBR2 and stimulates proliferation, and miR-106b-5p promotes PTEN downregulation to achieve similar effects [69]. Hence, there is the potential for somatic cancer-associated mutations to regulate MALAT1 function through structural changes in HeLa cells.