Current Technical Approaches to Study RNA–Protein Interactions in mRNAs and Long Non-Coding RNAs

It is commonly understood that RNA-binding proteins crucially determine the fate of their target RNAs. Vice versa, RNAs are becoming increasingly recognized for their functions in protein regulation and the dynamics of RNA-protein complexes. Long non-coding RNAs are emerging as potent regulators of proteins that exert unknown RNA-binding properties and moonlighting functions. A vast array of RNA- and protein-centric techniques have been developed for the identification of protein and RNA targets, respectively, including unbiased protein mass spectrometry and next-generation RNA sequencing as readout. Determining true physiological RNA and protein targets is challenging as RNA–protein interaction is highly dynamic, tissue- and cell-type-specific, and changes with the environment. Here I review current techniques for the analysis of RNA–protein interactions in living cells and in vitro. RNA-centric techniques are presented on the basis of cross-linking or the use of alternative approaches. Protein-centric approaches are discussed in combination with high-throughput sequencing. Finally, the impact of mutations in RNA–protein complexes on human disease is highlighted.


Introduction
RNA-binding proteins (RBPs) dictate the life and fate of every RNA, with functions ranging from transcription factors and single-stranded RNA-binding proteins to assemblies in ribonucleoprotein complexes (RNPs) such as the spliceosome. Vice versa, RNAs are emerging as regulators of the functions of many cognate binding proteins. Long non-coding RNAs (lncRNAs) have been demonstrated to act as scaffolds in the organization of RNP assembly and to recruit transcription factors or chromatin-modifying complexes, and small nuclear RNAs (snRNAs) are functional components of the spliceosome [1,2]. lncRNAs may also have indirect functions by weakly associating with RNAs, chromatin, or RBPs, contributing to condensate formation that supports gene regulation. Thus, RBPs and RNAs mutually influence each other's fate. Canonical RBPs exhibit common RNA-binding domains such as the hnRNP K homology (KH) domain, RNA recognition motif (RRM), zinc finger, and DEAD box helicase domains [3][4][5], which typically consist of α-helices often complemented by β-sheets, but increasing evidence reveals non-classical RBPs with intrinsically disordered domains that can loosely interact with RNA, proteins, or chromatin [6]. Besides the role of these domains in the formation of membrane-less organelles such as stress granules by liquid-liquid phase separation [7], their interactions with RNA through small polar amino acids in conserved RGG or YGG repeats are less specific and are also believed to be responsible for the moonlighting actions of metabolic enzymes such as GAPDH [8].
The analysis of RNA-protein interactions is crucial to understand the functions of RNP complexes in cellular metabolism and disease. Especially for lncRNAs, which are marked by low expression and instability, the identification of RNA-binding proteins requires optimized detection techniques. Here I review current and novel techniques to

In Vitro RNA-Centric Approaches
RNA-centric approaches involve the isolation of an RNA of interest by IP and identification of the bound proteins by immunoblotting or mass spectrometry. The in vitro approach is best suited for characterising known interactions, for instance by mutating those nucleotides responsible for protein binding, or validating RBP binding to the target RNA under varying cellular conditions such as growth, oxidative stress, protein depletion, or drug treatment.
In vitro pulldown approaches usually focus on an in vitro-transcribed (IVT) RNA of interest that can be immunoprecipitated in several ways, through prior modification of the RNA with affinity tags or 3′ extensions, or by hybridisation of the RNA to target-specific antisense probes (Figure 1a). In addition, poly(dT) oligonucleotide capture has been applied to enrich poly(A) mRNA-associated RBPs [9][10][11]. As an affinity handle, an IVT RNA can be 5′-tagged with biotin [12] or an aptamer sequence [13]. For example, a streptavidin-binding aptamer, S1, was inserted into an mRNA 3′ UTR to pull down the proteins that bound to the AU-rich element (ARE) of the reporter [14]. Alternatives to the four S1 aptamer hairpins are the bacteriophage-derived PP7 and MS2 binding elements. Up to 24 copies of hairpin loop domains are inserted into reporter mRNAs or endogenous loci as a 3′ extension for recognition and immunoprecipitation by the PP7 [15] and MS2 [16] coat proteins, respectively.
Figure 1. RNA-centric approaches to study RNA-protein interactions. (a) In vitro-transcribed RNA tagged with biotin, aptamer or stem-loop MS2 is incubated with cell lysate or purified protein and pulled down with streptavidin beads or MS2-coat protein immobilized on beads for analysis of bound proteins. Protein analysis is performed by Western blot or mass spectrometry similar to steps 5. and 6. shown in (c). (b) UV-cross-linked RNA-protein complexes are recovered by IP with target-specific probes or poly(dT) oligonucleotides coupled to beads (upper panel). In the click-chemistry-based method CARIC, metabolically labelled 5-ethynyluridine (5EU)-containing transcripts are clicked to biotin-azide for purification (lower panel). Metabolic labelling with 4-thiouridine (4sU) serves to enhance cross-linking efficiency. Although not shown, recovered RNA-protein complexes are subjected to similar analysis steps 5. and 6. as shown in (c). (c) Proximity labelling is performed by recruiting a biotin ligase BirA in close vicinity to an RNA targeted by the fusion protein. A BoxB or MS2 stem-loop is recognised and bound by the λN peptide or MS2 coat protein, respectively. Subsequent IP of the biotinylated RBPs in complex with the RNA and elution of the RBPs allows for protein analysis by mass spectrometry or Western blotting.
After incubation with cell lysate, the proteins that stay bound to the immobilized RNA are eluted by RNase digestion or boiling in SDS buffer for further analysis by sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) or for mass spectrometry (MS) [12]. For desthiobiotin-modified RNAs, gentle conditions of 12.5 mM biotin are sufficient to elute the RNA-protein complexes [17].
To identify RNA-interacting proteins by MS, proteins are often labelled with a mass tag to enable quantification. Labelling is achieved by metabolic labelling in stable isotope-labelled media (SILAC), chemical labelling of proteolysed peptides, or spike-in of peptide standards. Alternatively, label-free direct quantification of proteins is used in comparison to a control condition [18].
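The comparison of a pulldown to a control condition can be illustrated with a minimal sketch. It assumes label-free intensities per protein and computes a log2 enrichment. The protein names, intensity values, pseudocount, and the 2-fold cutoff are hypothetical choices for illustration and are not taken from the cited studies.

```python
import math

def log2_enrichment(pulldown, control, pseudocount=1.0):
    """Return log2(pulldown / control) per protein.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for proteins that are absent from one of the two samples.
    """
    proteins = set(pulldown) | set(control)
    return {
        p: math.log2(
            (pulldown.get(p, 0.0) + pseudocount)
            / (control.get(p, 0.0) + pseudocount)
        )
        for p in proteins
    }

def enriched_proteins(pulldown, control, min_log2=1.0):
    """List proteins enriched at least 2**min_log2-fold over the control."""
    scores = log2_enrichment(pulldown, control)
    return sorted(p for p, score in scores.items() if score >= min_log2)

# Hypothetical label-free intensities (arbitrary units).
pulldown = {"HuR": 799.0, "GAPDH": 99.0, "TTP": 399.0}
control = {"HuR": 99.0, "GAPDH": 99.0, "TTP": 49.0}

print(enriched_proteins(pulldown, control))
```

In practice, replicate measurements and a statistical test would replace the fixed cutoff used here.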

Techniques Involving Cross-Linking
In vivo approaches, on the other hand, capture an endogenous RBP usually after cross-linking it to its cognate RNA in the cell. Cells are treated with ultraviolet (UV) light or formaldehyde before purifying the RNA under denaturing conditions to retain the covalent RNA-protein complexes and dispose of weakly associated proteins. In this step, antisense oligonucleotide probes are used to hybridize to and pull down the RNA of interest. A tandem purification approach employed target-specific probes after poly(A) enrichment of mRNA [19]. Due to a cell lysis-induced RNA fragmentation step, tiled 5′-biotinylated probes were used on the long non-coding RNA Xist to retrieve 81 Xist-interacting proteins from HeLa cells, including Spen and hnRNPK [20]. This method, ChIRP-MS or comprehensive identification of RNA-binding proteins by mass spectrometry, also captured novel RBPs of the short snRNAs U1 and U2. RBR-ID (proteomic identification of RNA-binding regions) could map RNA-binding regions of RBPs from embryonic stem cell nuclei by efficiently cross-linking metabolically labelled 4-thiouridine (4sU)-containing transcripts to RBPs [21].
Apart from endogenous transcripts, biotin- or MS2-engineered RNAs with their complexed RBPs can be captured by co-expressing the MS2 coat protein and immobilization to beads (Figure 1b). A streptavidin-biotin pulldown strategy has also been tested from the protein side in MS2-BioTRAP. Here, the MS2 coat protein is fused to an HB tag that contains an in vivo biotinylation site and is enriched together with its MS2-tagged target mRNA. The mRNA, in turn, is complexed with the cross-linked cellular RBPs [22]. In addition to DNA or RNA probes, peptide nucleic acids (PNAs) have been used to hybridize to target mRNAs localized in dendritic neurons [23]. UV irradiation enables the activated PNA to cross-link with proteins that directly contact the mRNA/PNA complex. mRNA-interacting proteins are then identified by MS after pulldown with an antisense biotinylated probe.
For lncRNAs, the CHART method was used to validate cognate RBPs of the lncRNAs NEAT1 and MALAT1 in human cells and of roX2 in flies by Western blot. In addition, CHART could map genome-wide binding regions of roX2 to chromatin by using desthiobiotin probes for RNA pulldown from cross-linked chromatin extracts and sequencing of the isolated DNA [17]. The lncRNAs Xist and Firre were studied by using 5′-biotinylated 90 nt-long antisense DNA oligos. This method, RAP (RNA antisense purification), was initially developed to analyse RNA localisation to chromatin, but in combination with MS (RAP-MS) it served to identify RBPs that are responsible for X-chromosome inactivation [24][25][26].
The extent, efficiency, and specificity of cross-linking depend on the choice of reagent: for RNA-protein interactions, UV light is usually superior to formaldehyde as it does not induce protein-protein bonding, captures only direct contacts ('zero distance'), and is irreversible [27]. However, as the recovery of RNA-protein complexes is often low and requires high cell input, formaldehyde can be preferable due to its higher cross-linking efficiency [28]. Cross-linking efficiency for both reagents is also biased towards certain nucleotide sequences.
To circumvent the difficulties of quantitative RNA-protein complex recovery, the in vivo method CARIC (click-chemistry-assisted RNA interactome capture) uses UV cross-linking and click chemistry to pull down and enrich metabolically labelled RNAs with their cognate protein partners (Figure 1b). The researchers dually labelled RNAs with 4sU and 5-ethynyluridine (5EU), which enable a photoactivated cross-linking step of bound RBPs and a selective click reaction with biotin-azide, respectively. MS proteomics identified 597 human proteins including 130 previously unknown RBPs, some without known RNA-binding motifs, that targeted lncRNAs, snRNAs, and miRNAs [29]. A similar method, termed RICK (RNA interactome using click chemistry), introduced 5EU-labelling of nascent transcripts and click-chemistry-assisted pulldown for the identification of RBPs from often-neglected non-polyadenylated transcripts [30].
Moreover, there are methods like RNAcompete and RNAMaP that quantitatively measure binding kinetics of a mutational library of a given RNA to an RNA-binding protein on a DNA-sequencing platform. These approaches decompose contributions of RNA primary and secondary structure to binding affinities and determine sequence motif preferences of RBPs [31,32].
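To make the link between binding affinity and mutational effects concrete, the following minimal sketch assumes a simple one-site equilibrium binding model, which is a simplification and not necessarily the model used in the cited studies. It computes the fraction of RNA bound at a given protein concentration and the change in binding free energy of a mutant relative to the wild type. The function names and the dissociation constants in the usage note are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA bound at equilibrium for one-site binding.

    Assumes the protein is in excess, so free protein ~ total protein.
    """
    return protein_nM / (kd_nM + protein_nM)

def ddg_kcal_per_mol(kd_mut_nM, kd_wt_nM, temp_K=298.15):
    """Change in binding free energy: ddG = RT * ln(Kd_mut / Kd_wt).

    Positive values mean that the mutant RNA binds more weakly.
    """
    return R_KCAL * temp_K * math.log(kd_mut_nM / kd_wt_nM)
```

For example, with a hypothetical wild-type Kd of 10 nM, `fraction_bound(10.0, 10.0)` gives half-maximal binding, and a mutant with a tenfold higher Kd corresponds to a penalty of roughly 1.4 kcal/mol at 25 °C.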

Proximity Labelling Techniques
Approaches without cross-linking exploit affinity labelling to mark proteins in the vicinity of the RNA of interest, making RBPs amenable to affinity enrichment and Western blotting or MS analysis. Proximity labelling is widely applied by using engineered biotin ligases (E. coli BirA and homologs BioID2 and RABU) [33][34][35][36] or ascorbate peroxidases (APEX, APEX2) [37,38] in combination with a tagging reagent, which is most commonly biotin. Biotin ligases convert biotin to biotinoyl-5′-AMP that selectively reacts with lysine residues, while the APEX enzymes convert biotin-phenol in the presence of H2O2 to biotin-phenoxyl radicals that preferentially attack tyrosine residues. The labelling enzyme is fused to an RNA-binding protein to target endogenous RNA loci and label the surrounding proteins. APEX2 is also widely used to assess the spatial proteomic landscape of cellular compartments [37,38].
In RaPID (RNA-protein interaction detection), a fusion of the biotin ligase BirA and the RNA-binding λN peptide is co-expressed with an RNA engineered with a BoxB stem-loop (Figure 1c). In biotin-containing medium, macromolecules within a range of 20 nm around the λN-bound BoxB-RNA will be labelled [33]. RaPID is used to discover novel RNA-protein interactions, but is limited to lysine-specific biotinylation and requires the overexpression of engineered RNAs. A similar approach uses the MS2 coat protein (MCP) system to guide BirA to an mRNA that is tagged with 24 MS2 stem-loops at the 3′ UTR [39]. Although this method, RNA BioID, could identify the dynamic proteome of endogenous β-actin mRNA, it is equally limited to genetically engineered mRNA loci.
A recently developed method relies equally on biotin proximity labelling, but does so by exploiting the CRISPR-Cas system for a completely different mechanism. A biotin-tagged small peptide, PupE, is covalently attached to proteins surrounding the Cas-targeted RNA. This allows for affinity purification, enrichment, and MS analysis of biotin-PupE-labelled RBPs [40]. Strikingly, an RNA-targeting, nuclease-inactive dCas13a protein acts as the guide to recruit the fused PafA ligase to a defined RNA locus. CRUIS, or CRISPR-based RNA-united interacting system, requires cells that stably express the dCas13a-PafA fusion protein and are transfected with single-guide RNA and biotin-PupE plasmids. CRUIS has been shown to identify novel RBPs of the lncRNA NORAD.
Equally, the CRISPR-based method CBRPP (CRISPR-based RNA proximity proteomics) detected RBPs of NORAD and β-actin mRNA. CBRPP requires a fusion of dCas13b and the smallest biotin ligase, a BirA homolog [35], to be stably expressed in HEK293T cells, and co-expression of β-actin-targeted crRNAs for guidance [41]. Low expression levels under a Tet-On 3G promoter and incubation with biotin for 18 h ensure that true proximate RBPs, which are repeatedly labelled over time, are captured, and reduce background-labelled proteins, as these do not accumulate over time.

Protein-Centric Approaches
Protein-centric approaches involve targeting of a protein of interest by IP and identifying the RNAs bound to it, usually by downstream RNA-sequencing (Table 1). Depending on the method, RBP binding sites can be mapped from regions spanning up to 100 nucleotides (nt) down to single-nucleotide resolution and determination of preferred binding motifs.

Techniques Involving Cross-Linking: CLIP and Related Protocols
By far the most widely used method is cross-linking and immunoprecipitation (CLIP), which exists in multiple forms with common basic steps (Figure 2a). The initial freezing of RNA-protein bonds is achieved via an in vivo UV cross-linking step, and after cell lysis and partial RNase digestion the protein of interest is immunoprecipitated with a specific antibody [42]. After separation of the protein-RNA complexes by SDS-PAGE and transfer to a membrane, the RNA is liberated from the excised complexes by proteinase K digestion. In the original protocol, the RNA was 32P-labelled, the cross-linked sites were read through in RT-PCR, and binding regions were narrowed down by overlaying sequencing clusters [42]. For high-throughput sequencing (HITS-CLIP), preparation of cDNA libraries comprises universal steps, the order of which varies for each protocol [43]. First, 3′ and 5′ adapter ligation require phosphatase and kinase treatment, respectively. Reverse transcription (RT) to cDNA, PCR amplification, and massively parallel sequencing then reveal the RNA targets of the RBP. Of note, quantitative CLIP-seq [44] analysis is challenging as it depends on cross-linking efficiency, which is different for each RNA species and sequence. The number of reads per transcript does not directly reflect the number of cross-links formed, and thus is not equivalent to the amount of protein interacting with this transcript [43]. In addition, CLIP has several limitations: high false-positive rates, limited antibody efficiency, and high input cell numbers restrict its use to bulk mixtures of millions of cells [45]. Alternative protocols such as iCLIP, irCLIP, eCLIP, GoldCLIP, and CRAC exploit the fact that 80% of RT reads stop at the cross-link site, and use RT stops to map RBP sites at single-nucleotide resolution.
The above CLIP protocols show improvements such as 5′ cDNA circularisation, infrared-dye 3′ adapters, on-bead adapter ligation, use of epitope-tagged RBPs, and replacement of gel purification by affinity purification, respectively [46][47][48][49][50].
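How RT stops translate into cross-link positions can be sketched as follows. The example assumes the common iCLIP-style convention that the cross-linked nucleotide lies immediately upstream of the 5′ end of the aligned cDNA insert, and that alignments are given in 0-based, half-open coordinates. The tuple layout of the reads and the function names are hypothetical.

```python
from collections import Counter

def crosslink_position(start, end, strand):
    """Return the putative cross-link site for one aligned cDNA insert.

    Coordinates are 0-based, half-open [start, end). The cross-link is
    assumed to sit one nucleotide upstream of the read 5' end, where
    reverse transcription stopped.
    """
    if strand == "+":
        return start - 1
    if strand == "-":
        return end
    raise ValueError(f"unknown strand: {strand!r}")

def count_crosslinks(reads):
    """Count truncation events per (chromosome, position, strand)."""
    return Counter(
        (chrom, crosslink_position(start, end, strand), strand)
        for chrom, start, end, strand in reads
    )

# Hypothetical alignments: (chromosome, start, end, strand).
reads = [
    ("chr1", 100, 130, "+"),
    ("chr1", 100, 125, "+"),
    ("chr1", 50, 80, "-"),
]
print(count_crosslinks(reads))
```

Positions supported by many independent truncation events are candidate binding sites; read-through cDNAs and PCR duplicates would need separate handling in a real analysis.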
In the alternative protocol PAR-CLIP (photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation), metabolically labelled RNAs containing 4sU or 6-thioguanosine (6sG) are employed as cross-linking enhancers [51]. Treatment of cells with 4sU labels 1 in 40 uridines, and the photoactivatable nucleoside enhances the UV cross-linking efficiency at 365 nm in comparison to conventional CLIP at 254 nm. Since cross-linked 4sU causes reverse transcriptase to incorporate G in the opposite cDNA strand, sequencing of T-to-C mutations allows RBP binding motifs to be mapped at cross-linking sites. In this way, the RNA recognition element (RRE) of Pumilio 2 was mapped, and thousands of mRNA targets for AGO/miRNA complexes were identified. RIP-seq is a similar protein-centric approach that leaves out the cross-linking step. Nuclear or cellular lysate containing native RNA-protein complexes is incubated with an antibody against the protein of interest, e.g., the polycomb repressive complex PRC2. Immunoprecipitation with beads followed by RT and sequencing of the extracted RNA identified thousands of PRC2-interacting RNAs [52].
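The T-to-C readout of PAR-CLIP described above can be illustrated with a minimal sketch that compares an aligned read to the reference sequence and reports the positions of T-to-C conversions. It assumes an ungapped alignment of equal length on the sense strand; the function name and the sequences in the usage note are hypothetical.

```python
def t_to_c_sites(ref_seq, read_seq, ref_start=0):
    """Return reference positions where the reference has T and the read has C.

    Assumes an ungapped, same-strand alignment in which read_seq[i]
    aligns to ref_seq[i]; ref_start is the coordinate of ref_seq[0].
    """
    if len(ref_seq) != len(read_seq):
        raise ValueError("reference and read must have the same length")
    return [
        ref_start + i
        for i, (ref_base, read_base) in enumerate(
            zip(ref_seq.upper(), read_seq.upper())
        )
        if ref_base == "T" and read_base == "C"
    ]
```

For instance, `t_to_c_sites("ACTGTTA", "ACCGTCA", ref_start=10)` reports positions 12 and 15. Sites that recur across many reads are more likely to reflect cross-linked 4sU than sequencing errors or polymorphisms.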
Recently, a novel method, GECX-RNA or genetically encoded chemical cross-linking of proteins with target RNA, has applied the incorporation of unnatural amino acids into the Escherichia coli Hfq chaperone to enable cross-linking with nucleic acids directly at the active site. After antibody-assisted pulldown of the protein, reverse transcription and cDNA sequencing can detect the binding sites of the RBP site-specifically and in an accurate way [53].
Figure 2. (a) UV cross-linked RNA-protein complexes are immunoprecipitated with an RBP-specific antibody, separated by size via gel electrophoresis and transferred to a membrane. Target complexes are excised and RNA is extracted by proteinase K digestion of protein components. A cDNA library protocol is followed and transcripts and RBP binding sites are identified by next-generation sequencing. Note that the order of steps 3. to 7. varies depending on the CLIP protocol. (b) TRIBE uses a fusion of the deaminase ADAR and the RBP to induce in-cell A-to-I editing near RBP binding sites. Inducible expression in distinct cell types allows identification of cell-type-specific targets and of binding regions such as 3′ UTR or coding sequence. (c) STAMP detects RBP binding sites by induced mutations in RNA-sequencing reads. A fusion of deaminase APOBEC to the RBP of interest is overexpressed as stable cell line, and C-to-U conversions are inserted in the RNA in the vicinity of the RBP binding site. Subtraction of background edits yields the editing clusters on target mRNAs. (d) Combination of STAMP and TRIBE using a first RBP (RBP1) fused to APOBEC and a transfected second RBP (RBP2) fused to ADAR. Fluorescent cell sorting guarantees parallel expression. cDNA-sequencing maps co-edited sites on the same read and identifies shared RNA targets of RBP1 and RBP2. Editing percentage is defined as the number of A-to-I or C-to-U edits at a given site divided by the number of reads at that site.

Techniques without Cross-Linking: TRIBE and STAMP
An alternative to CLIP is to use the RBP of interest fused to the nucleotide deaminases ADAR or APOBEC1. RNAs bound by the RBP are deaminated close to the binding sites, and sequencing reveals the mutated transcripts as RNA targets [54,55]. One can also fuse a poly(U)-polymerase to the RBP, which adds poly(U)-tails to the 3′ end of targeted RNAs and enables identification via RNA sequencing [56].
TRIBE (targets of RNA-binding protein identified by editing) uses stable expression of the deaminase ADAR fused to an RBP to direct A-to-I editing near RBP binding sites, and after mRNA isolation and cDNA sequencing identifies RBP-specific mRNA targets (Figure 2b). The method was applied to Drosophila S2 cells and to transient expression in fly brain neurons to identify cell-type-specific targets of Hrp48 and two other RBPs [54]. However, drawbacks of TRIBE are false negatives due to its low editing frequency of one edit per transcript and the bias of ADAR towards bulged As next to double-stranded (ds) RNA regions. To overcome the low editing frequency of TRIBE, the authors optimized the deaminase efficiency of ADAR through an E488Q mutation, which led to hyper-edited sites of >1 edit per read and increased the numbers of RBP binding sites and of target mRNAs identified [57]. However, the method still suffers from the structure and sequence bias of ADAR, which prefers bulged As in dsRNA regions and nearest-neighbour As.
STAMP (surveying targets by APOBEC-mediated profiling) was developed to overcome the limitations of low editing by ADAR. Overexpression of an RBP-APOBEC1 fusion leads to binding site-adjacent C-to-U editing by the deaminase. Clusters of 10-100 A mutations in RNA-sequencing reads are enriched relative to the APOBEC-only control (Figure 2c) [55]. STAMP was expanded to single-cell RNA-sequencing using the 10x Genomics bead capture system and to long-read PacBio sequencing to distinguish isoform-specific edits. STAMP was demonstrated to detect ribosome-associated targets correlating with high translation and to extract RBFOX2-binding site motifs de novo from edited regions, even from single-cell data. However, the edit enrichment in mRNA 3′ UTRs could indicate false positives that are generated by the high levels of overexpression.
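The editing readout shared by TRIBE and STAMP, that is, the number of edited reads at a site divided by the number of reads covering it, and the comparison to a deaminase-only control can be sketched as below. The simple subtraction of control editing is an illustrative simplification rather than the statistical procedure of the original publications, and the site keys and read counts are hypothetical.

```python
def editing_fraction(edited_reads, total_reads):
    """Edited reads divided by total reads at one site (0.0 if uncovered)."""
    if total_reads == 0:
        return 0.0
    return edited_reads / total_reads

def background_subtracted(fusion, control):
    """Subtract control editing from RBP-fusion editing per site.

    Both arguments map a site to (edited_reads, total_reads).
    Negative differences are clipped to zero.
    """
    result = {}
    for site, (edited, total) in fusion.items():
        ctrl_edited, ctrl_total = control.get(site, (0, 0))
        diff = editing_fraction(edited, total) - editing_fraction(
            ctrl_edited, ctrl_total
        )
        result[site] = max(0.0, diff)
    return result

# Hypothetical counts: site -> (edited reads, total reads).
fusion = {("chr1", 100): (30, 100), ("chr1", 200): (5, 50)}
control = {("chr1", 100): (5, 100), ("chr1", 200): (10, 50)}
print(background_subtracted(fusion, control))
```

Sites that retain a high editing fraction after subtraction can then be grouped into clusters on target transcripts.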
The method TRIBE-STAMP combines the ideas of TRIBE and STAMP to simultaneously analyse the binding of two RBPs in a cell and elucidate sequential binding events in time on a single mRNA target [58]. The authors applied the mutational editing by ADAR and APOBEC1 to N6-methyladenosine (m6A)-harbouring transcripts that are bound by the m6A reader proteins YTH-DF1, -DF2, and -DF3. Each combination of tagged DF pairs was co-expressed in HEK293T cells, e.g., DF1-APOBEC and DF2-ADAR. Then, C-to-U- and A-to-I-induced mutations were mapped by cDNA-sequencing, and co-editing events > 1 were assessed for individual reads (Figure 2d). This revealed that 40% of the mRNA targets overlapping for DF1, DF2, and DF3 exhibited co-editing, indicating that the DF paralogs bind and modify the same transcript sequentially in time. Excitingly, this not only validates the redundancy of DF proteins [59], but also implies that DF binding to m6A sites does not immediately trigger mRNA degradation. Rather, one editing event, i.e., m6A binding of a first DF, e.g., DF1, increased the likelihood of a second editing event on the same molecule and m6A site by a second DF, e.g., DF3 [58].
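The idea of scoring co-editing on individual reads can be illustrated with a minimal sketch. Each read is reduced to its number of A-to-I and C-to-U edits, and the fraction of edited reads that carry both edit types is reported. This definition of co-editing and the per-read tuple format are assumptions made for illustration and may differ from the metric used in the original study.

```python
def co_editing_fraction(reads):
    """Fraction of edited reads that carry both A-to-I and C-to-U edits.

    `reads` is an iterable of (n_a_to_i, n_c_to_u) edit counts per read.
    Reads without any edit are ignored; returns 0.0 if no read is edited.
    """
    edited = [(a, c) for a, c in reads if a > 0 or c > 0]
    if not edited:
        return 0.0
    both = sum(1 for a, c in edited if a > 0 and c > 0)
    return both / len(edited)
```

For example, `co_editing_fraction([(1, 1), (2, 0), (0, 3), (0, 0), (1, 2)])` returns 0.5, because two of the four edited reads carry both edit types.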
A different method to identify RBP-targeted transcripts was developed based on the covalent tagging with uridines at the 3′ end of RNAs [56]. A fusion chimera of the yeast PUF3 protein and the poly(U)-polymerase PUP, obtained via genome engineering in Saccharomyces cerevisiae, successfully tagged PUF3 targets in vivo and allowed identification of >400 PUF3-regulated mRNAs for mitochondrial regulation. Interestingly, the method can also reliably class transcripts by their PUF3-binding affinities, as the number of nucleotides added (1-10 Us) by PUP is directly correlated to the time that the RBP stays bound to the RNA: the longer the U-tail, the more productive the RBP binding event has been in vivo.
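Ranking transcripts by U-tail length can be sketched as follows. The example assumes that each read ends at the RNA 3′ end and that trailing U (or T in cDNA) residues are non-templated; it does not distinguish them from genome-encoded terminal uridines, which a real analysis would have to do. The transcript names and sequences are hypothetical.

```python
from collections import defaultdict

def u_tail_length(read_seq):
    """Number of consecutive U (or T) residues at the 3' end of a read."""
    seq = read_seq.upper().replace("T", "U")
    return len(seq) - len(seq.rstrip("U"))

def rank_by_mean_u_tail(reads):
    """Rank transcripts by mean U-tail length, longest first.

    `reads` is an iterable of (transcript_id, read_sequence) pairs.
    """
    tails = defaultdict(list)
    for transcript, seq in reads:
        tails[transcript].append(u_tail_length(seq))
    means = {t: sum(v) / len(v) for t, v in tails.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

# Hypothetical reads ending at the RNA 3' end.
reads = [
    ("mRNA_A", "ACGAUUUUUU"),
    ("mRNA_A", "ACGAUUUU"),
    ("mRNA_B", "GGCAU"),
    ("mRNA_B", "GGCA"),
]
print(rank_by_mean_u_tail(reads))
```

Under these assumptions, transcripts at the top of the ranking would correspond to the more persistently bound targets.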
Recently, a novel method called RT&Tag has used reverse transcription and transposase-mediated tagmentation of RNA/cDNA hybrids on immunoprecipitated nuclear extracts to map RNA-protein interactions as well as to detect chromatin-associated transcripts and RNA modifications [60]. The authors could capture the non-coding RNA roX2 from flies together with its dosage compensation complex and transcripts associated with chromatin silencing.
Other established protocols like APEX-seq have exploited proximity biotinylation of RNAs in the vicinity of a specific RBP in combination with sequencing [61]. In the presence of biotin and H2O2, the ascorbate peroxidase APEX2 labels any transcript proximal to its fusion partner eIF4A to reveal eIF4A-specific RNA interactions [62]. Combination with metabolic 4sU-labelling of nascent RNAs further improved the efficiency of biotin-labelling and the enrichment of biotinylated RNAs in a study of the subcellular spatial transcriptome [63].

Discussion
Databases listing RNA-protein interactions are helpful tools to obtain information on a given target, expression pattern, RNP complex composition, binding sites, affinities, and other parameters. The most recent releases are listed here: POSTAR3 is a comprehensive database comprising a large collection of RBPs and binding sites from CLIP-seq data as well as RNA structure-seq data [64]; RNAInter contains RNA interaction networks along with RNA structure and expression [65]; RBP2GO lists RBPs of 13 species with interactions and functions [66]; and RPpocket provides RNA-protein complex interactions and an analysis of RNA-binding pocket topology [67]. Although less up to date, ProNIT is useful for experimentally determined thermodynamic parameters between proteins and nucleic acids [68].
Multiple RBP mutations are associated with human disease and underline the necessity to understand RNA-protein interactions. Mutations can lead to changes in splicing, RNA binding, protein interactions, catalysis, or localization [69]. For instance, a mutation in the spinal muscular atrophy (SMA)-associated SMN1 gene leads to silencing of a splicing enhancer, exon skipping, and production of a non-functional protein [70]. Loss of function of a specific KH domain is associated with fragile X mental retardation syndrome [5]. Mutations in the disordered repeat domains of FUS lead to toxic accumulation of protein aggregates in fly models and provide a rationale for neurodegenerative disorders like amyotrophic lateral sclerosis (ALS) [71,72]. On the other hand, mutations in the RNA part of the RNP directly influence the production of an mRNA-encoded protein, and decreased expression of microRNAs or increased expression of the lncRNA HOTAIR is associated with cancer [73]. HOTAIR recruitment of the Polycomb repressive complex for histone methylation leads to silencing of target genes and is associated with increased metastatic invasiveness [74].
Several tools to interfere with or rescue RNA-protein interactions have been validated in animal models and some are already being applied as clinical therapeutics. Small molecules can directly bind to an RBP or RNA target. For instance, antitumour drugs such as spliceostatin target two of the U2 snRNP protein components to prevent splice site recognition [69,75]. Of note, an unbiased small molecule screening approach for RNA targets recently guided lead optimization of a novel inhibitor of the lncRNA Xist that is able to disrupt X chromosome inactivation in a mouse cell model [76]. Antisense oligonucleotides (ASOs) are commonly used to block access of RBPs to their RNA binding sites. The ASO drug nusinersen is injected quarterly into patients suffering from SMA: in mice, the ASO-masking of the splicing silencer site in the SMN2 pre-mRNA rescued splicing and production of the functional SMN protein [77]. Other strategies to disrupt RNA-protein interactions for therapeutic intervention employ RNA interference (RNAi), which is used to trigger decay or inhibit translation of a specific mRNA target by hybridization to its 3′ UTR [73,78], and aptamers, synthetically evolved RNA sequences that specifically bind to a lncRNA or an RBP [79].
Future research will certainly involve the CRISPR/Cas system to exploit the endoribonuclease activity of Cas13 to specifically target RNAs in cells. More excitingly, the RNA-recruiting function of dCas13 variants could be applied to site-specifically revert disease-relevant mutations. Since many methods rely on cross-linking but suffer from low cross-linking efficiency, alternative methods involving editing enzymes or tagging enzymes will be of high interest. These methods can be further improved and applied to other targets. Still, alternative cross-linking reagents or the 4sU-labelling approach will also be in high demand to capture RNA-protein complexes in vivo. Regardless of the methodology, every newly identified target should be cross-validated with orthogonal methods. In CBRPP, for instance, dCas13-identified RBPs of the lncRNA NORAD were validated by the protein-centric methods RIP and CLIP [41]. Overall, it will be exciting to see applications of other RNA-manipulating enzymes for novel method developments in the study of RNA-protein interactions.