Detection and Analysis of RNA Ribose 2′-O-Methylations: Challenges and Solutions

Ribose 2′-O-methylation is one of the most common RNA modifications and is found in almost every type of cellular RNA. It decorates transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs) (and most probably small nucleolar RNAs, snoRNAs), as well as regulatory RNAs such as microRNAs (miRNAs) and Piwi-interacting RNAs (piRNAs), and finally, eukaryotic messenger RNAs (mRNAs). Because RNA 2′-O-methylation is so exceptionally widespread, considerable efforts have been made to map these numerous modifications precisely. Extensive studies of RNA 2′-O-methylation were also stimulated by the discovery of the C/D-box snoRNA-guided machinery, which ensures site-specific modification of hundreds of 2′-O-methylated residues in archaeal and eukaryotic rRNAs and some other RNAs. In this brief review we discuss both traditional approaches of RNA biochemistry and modern deep sequencing-based methods used for detection/mapping and quantification of RNA 2′-O-methylations.


Introduction
2′-O-methylation is a highly common modification in different cellular RNAs, present in transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear/small nucleolar RNAs (snRNAs/snoRNAs), as well as in microRNAs (miRNAs)/Piwi-interacting RNAs (piRNAs) and some messenger RNAs (mRNAs) (for review see [1]). The enzymatic machinery implicated in 2′-O-methylation (2′-O-Me) is rather complex and diversified. RNA methylation is ensured both by stand-alone protein enzymes and by small nucleolar ribonucleoprotein (sno(s)RNP) complexes integrating a C/D-box sno(s)RNA guide and a common catalytic subunit (Nop1/fibrillarin). These aspects of RNA methylation, as well as the numerous biological functions of 2′-O-methylation, are discussed in detail elsewhere [1].
Ribose 2′-O-methylation (RNA 2′-O-methylation, Figure 1a,b) confers on the RNA polynucleotide chain particular physico-chemical properties and a specific reactivity, which differ considerably from those of unmodified RNA. Most of these changes are now exploited for specific detection of ribose 2′-O-methylation. First, the presence of a methyl (-CH₃) group at the 2′-OH of the ribose preferentially stabilizes the 3′-endo ribose conformation, typical for nucleotides in an A-type RNA chain [2][3][4][5][6]. Second, methylation of the ribose 2′-OH almost completely abolishes the nucleophilic property of the 2′-OH oxygen atom, leading to a greatly increased resistance of the 3′-adjacent phosphodiester bond to alkaline hydrolysis (Figure 1a), as well as to nuclease cleavage (RNase T2 and RNase H, for example). Third, when a 2′-O-Me is present at the 3′-terminal nucleotide, the methyl group prevents coordination with bidentate oxidative agents, such as the periodate (IO₄⁻) ion, and thus protects the terminal ribose [7] (Figure 1b). Such 3′-terminal 2′-O-methylation also negatively affects the ligation efficiency of RNA ligase [8,9] and the 3′-end activity of polyA-polymerase [10,11] (Figure 1c). Even though 2′-O-Me does not directly affect the base-pairing properties of nucleotides, minor steric hindrance at the ribose moiety becomes important during primer extension, especially at low deoxynucleotide triphosphate (dNTP) concentrations. Thus, under these specific conditions, many natural RNA-dependent DNA polymerases (reverse transcriptases, RT-enzymes) show sensitivity to 2′-O-methylation in the RNA template [12].

Classical Methods for 2′-O-Methylation Detection
The presence and stoichiometry of 2′-O-methylated nucleotides in RNA can be detected and quantified both by general analytical approaches of RNA biochemistry and by specific applications exploiting particular chemical properties of 2′-O-methylation.

General Methods of RNA Analytical Chemistry
The presence of 2′-O-methylated nucleotides in RNAs was initially detected by different approaches, mostly based on general analysis of the nucleotide (or nucleoside) composition of cellular RNAs. For example, total RNA hydrolysis by perchloric acid (HClO₄), followed by measurement of the released methanol, was proposed for specific quantification of RNA 2′-O-methylation [13]. Other specific methods from this period used partial hydrolysis followed by oligonucleotide analysis coupled with periodate oxidation [14].
Like all other modified nucleotides in RNAs, 2′-O-Me can also be detected by various types of chromatography [15,16] as well as by mass spectrometry techniques [17][18][19][20], which are not particularly specific for these modified residues but still allow their detection and, in some instances, quantification (reviewed in [21][22][23]).
From the 1970s-1980s onward, two-dimensional thin layer chromatography (2D TLC) was widely employed for separation of mononucleotide 5′-phosphates obtained by enzymatic hydrolysis of RNAs and radioactively labeled in vivo (pre-labeling) or in vitro (post-labeling) (for examples see [24][25][26]). It is noteworthy that the 5′-Nm-N-3′ linkage is stable upon RNase T2 hydrolysis; thus, when RNase T2 is used, labeled dinucleotides containing 2′-O-methylation have to be analyzed instead of mononucleotide 3′-phosphates. For the analysis of long RNAs, RNA fingerprinting was frequently employed [27][28][29].
Recent examples of analytical strategies include isolation of mung bean nuclease or RNase T1 fragments of rRNA and their analysis by a combination of high-performance liquid chromatography with mass spectrometry (HPLC/MS), an approach which made it possible to finalize the modification profiles of the yeast Saccharomyces cerevisiae and human ribosomes [30][31][32][33]. Even though the analyses were done using different cell lines and certainly different culture conditions, modern mass spectrometry approaches [32,33] demonstrate an excellent correlation with the quantitative methylation data obtained by various RiboMethSeq protocols (R² > 0.98, see Supplementary Figure S1).

Specific Detection Strategies
Specific detection strategies are mostly based on the particular chemical properties of 2′-O-methylated nucleotides listed above.

Increased Resistance to Alkaline or Enzymatic Hydrolysis
As mentioned above, this property was already extensively used in the past to isolate alkali- (or nuclease-) stable 5′-Nm-N-3′ dinucleotides for global RNA modification analysis [34,35]. Later, it was noticed that when a modified RNA is subjected to random (statistical) alkaline hydrolysis, 2′-O-Me groups generate characteristic "gaps" in the random cleavage profile, due to the relative protection of the 3′-adjacent phosphodiester bond [12][36][37][38][39]. Such cleavage profiles can be analyzed either by using purified and end-labeled RNA, or directly on the total RNA fraction, using specific primer extension with an RT-enzyme (Figure 2a).

Resistance of 2′-O-methylated RNA to alkaline cleavage was later exploited in a site-specific detection approach based on the use of RNase H [40,41] and of two DNA-based ribozyme (DNAzyme) versions for specific RNA cleavage [42] (Figure 2b).

Reverse Transcriptase Stop at Low Deoxynucleotide Triphosphate Concentration
With the development of RT-dependent primer extension on RNAs, new approaches emerged for the detection of 2′-O-methylation. The observation that RT-enzymes stall or pause at a 2′-O-Me at low deoxynucleotide triphosphate concentration (low [dNTP]) opened a new way for specific detection [12,43,44]. These techniques were successfully applied to the analysis of various eukaryotic rRNAs [45][46][47] (Figure 3a). More recently, the same approach, although using a fluorescently labeled DNA oligonucleotide, was used to complete human rRNA 2′-O-methylation mapping [48]. Quantification of low [dNTP] RT stops by quantitative reverse transcription coupled with PCR amplification (qRT-PCR), the so-called RTL-P approach ([49], Figure 3b), was proposed for relative site-by-site quantification of 2′-O-methylation. More recently, an engineered KlenTaq RT-enzyme specific to 2′-O-methylation was developed. It can replace low [dNTP] conditions, since the mutant enzyme is sensitive to 2′-O-methylations and stalls at those residues even at normal dNTP concentrations [50].

Altered Enzymatic Activity with 2′-O-Me RNA 3′-Termini
The presence of a 2′-O-Me group may also affect enzymatic activity at RNA 3′-termini. First, a terminal 2′-O-methylation inhibits RNA ligase activity and thus reduces ligation efficiency at the 3′-end [8,9], introducing a considerable ligation bias during sequencing library preparation. This bias is a nuisance for global transcriptome-wide studies, but very useful for the analysis of terminal miRNA 2′-O-methylation. Interestingly, T4 DNA ligase, which can ligate RNAs in a duplex with DNA, is also sensitive to the presence of terminal RNA modifications, in particular 2′-O-methylations. This property was used for the development of a ligation-based approach to analyze RNA modifications [39,51]. In addition, a 2′-O-Me group at the 3′-terminus reduces the efficiency of miRNA polyadenylation by polyA-polymerase [10], and this allows direct measurement of miRNA and piRNA 3′-terminal 2′-O-methylation [11].

Limitations of Classical Detection Methods
These traditional methods for RNA 2′-O-methylation analysis have numerous limitations. Many of them are rather sensitive but, in turn, require the use of radioactively labeled RNA, while others use fluorescence-based detection, with a considerable loss of sensitivity.

• The required amounts of input RNA are quite substantial and a purification step is generally indispensable, making analysis possible only for highly abundant RNAs.
• RT-based methods are relatively sensitive, but generate multiple false-positive as well as false-negative signals [12,44]. Moreover, partial methylation is difficult to detect.
• Quantification is difficult and all approaches are laborious, time consuming and do not allow high-throughput analyses.

Deep Sequencing-based Approaches
In order to improve and accelerate detection and quantification of RNA 2′-O-methylation, three deep sequencing-based analytical approaches were proposed; all three exploit different particular properties of 2′-O-methylations (Figure 4).

RiboMethSeq
All variants of the published RiboMethSeq procedures are based on deep sequencing measurement of the 2′-O-methylation-induced protection of the 3′-adjacent phosphodiester bond against cleavage under alkaline conditions [52][53][54]. After random alkaline hydrolysis and optional enrichment of short fragments, the RNA pieces are converted to a sequencing library using an appropriate 3′- and 5′-adapter ligation protocol (Figure 4a). Sequencing is performed in either paired-end or single-read mode, and the reads obtained are mapped to the reference sequence using a precise end-to-end alignment mode, to determine the exact locations of the fragments' 5′- and 3′-ends. The number of these ends at every position is counted, and the resulting (5′/3′- or, sometimes, 5′-only) coverage profile is used to calculate protection (methylation) scores, allowing detection and rather precise quantification of the protection (methylation) level.
Figure 4 (panel c). RNA is fragmented chemically or by nuclease and exposed to periodate oxidation. Cis-diols of unmodified ribose are readily oxidized to dialdehyde structures, while 2′-O-methylated ribose residues are resistant to the treatment. Since enzymatic (or chemical) fragmentation is considerably biased, oxidation/phosphate removal cycles have to be repeated several times to get substantial enrichment of "oxidation-resistant" 3′-ends. Finally, the 3′-adapter is directly ligated to the methylated RNA 3′-end, providing the signal after mapping and counting of the sequencing reads (read2 in this case).
Despite their general similarity, the exact protocols used for library preparation and the bioinformatic analysis pipelines differ between RiboMethSeq versions, which probably explains some of the minor discrepancies reported. The original RiboMethSeq procedure [52,55] used a proprietary ligation protocol, exploiting ribozyme reactivity and a mutant RNA ligase for direct ligation to the 5′-OH and 3′-P extremities resulting from alkaline hydrolysis. This avoids minor biases related to the subsequent 3′-end de-phosphorylation and 5′-end re-phosphorylation steps used in other protocols; however, the relative inefficiency of the ligation protocol requires substantial amounts of input RNA (>1 µg, see below).
Optimization of all steps and replacement of the direct ligation steps by the highly efficient ligation protocol routinely used in small RNA sequencing (e.g., NEB Small RNA kit, New England Biolabs, Ipswich, MA, USA) reduced the amount of required material by almost 1000-fold and greatly simplified the whole analysis pipeline [53].
The currently implemented RiboMethSeq protocol is compatible with low input amounts of RNA, does not use time-consuming and laborious gel-purification steps, and provides reliable quantification of the modification level at only moderate sequencing depth and overall analysis cost.
For the moment, only RiboMethSeq protocols have been extensively applied to profiling and analysis of RNA 2′-O-methylation dynamics in rRNA, tRNA and other cellular RNAs (see Area of Applications below).
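The scoring idea behind RiboMethSeq (a dip in the number of fragment ends at a protected position, relative to its neighbours) can be illustrated with a minimal Python sketch. The score used here is a deliberately simplified stand-in, not the published MethScore/ScoreC formula; the function name `protection_scores`, the `flank` parameter and the example profile are assumptions made for illustration only.

```python
from statistics import mean

def protection_scores(end_counts, flank=2):
    """Simplified protection score per position (illustrative only).

    end_counts: number of fragment ends counted at each RNA position.
    flank: number of neighbouring positions used on each side (assumed value).
    Returns 1 - count / mean(neighbours), clipped at 0; None if undefined.
    """
    scores = []
    for i, count in enumerate(end_counts):
        left = end_counts[max(0, i - flank):i]
        right = end_counts[i + 1:i + 1 + flank]
        neighbours = left + right
        if not neighbours or mean(neighbours) == 0:
            scores.append(None)  # no usable neighbouring signal
            continue
        scores.append(max(0.0, 1.0 - count / mean(neighbours)))
    return scores

# Hypothetical end-count profile: position 3 is strongly protected.
profile = [100, 110, 95, 5, 105, 98, 102]
scores = protection_scores(profile)
```

In this toy profile the protected position receives a score close to 1, while its neighbours score near 0. Real protocols use weighted neighbourhoods and additional scores, so this sketch only conveys the principle.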

2′-OMe-Seq
The 2′-OMe-Seq protocol uses deep sequencing to map the RT stops generated by primer extension at low [dNTP] (Figure 4b). Abortive complementary DNA (cDNA) chains obtained under these conditions are converted to a sequencing library, and the cDNA 3′-ends are determined by mapping to the reference sequence [56]. Comparison with the normal RT extension profile at standard [dNTP] makes it possible to exclude some false-positive hits related to RNA structure and sequence.
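The comparison between low and standard [dNTP] profiles can be sketched as a per-position ratio of normalised cDNA 3′-end counts. This is a minimal illustration of the reasoning, not the scoring used in the original 2′-OMe-Seq publication; the function name, the pseudocount, the example counts and the cut-off of 2.0 are all assumptions.

```python
def rt_stop_enrichment(low_dntp, normal_dntp, pseudocount=1.0):
    """Ratio of normalised cDNA 3'-end counts (low [dNTP] / standard [dNTP]).

    Both inputs are per-position counts of cDNA 3'-ends along the same RNA.
    A pseudocount (assumed value) avoids division by zero.
    """
    if len(low_dntp) != len(normal_dntp):
        raise ValueError("profiles must cover the same positions")
    n = len(low_dntp)
    total_low = sum(low_dntp) + pseudocount * n
    total_norm = sum(normal_dntp) + pseudocount * n
    ratios = []
    for low, norm in zip(low_dntp, normal_dntp):
        frac_low = (low + pseudocount) / total_low
        frac_norm = (norm + pseudocount) / total_norm
        ratios.append(frac_low / frac_norm)
    return ratios

# Hypothetical counts: index 2 stops only at low [dNTP] (candidate site),
# index 3 stops in both conditions (likely structure- or sequence-related).
low = [10, 12, 200, 150, 9]
normal = [11, 10, 12, 140, 10]
ratios = rt_stop_enrichment(low, normal)
candidates = [i for i, r in enumerate(ratios) if r > 2.0]  # assumed cut-off
```

With these toy numbers only index 2 is retained, whereas the stop present in both conditions is discarded, which mirrors how the standard [dNTP] control removes some false positives.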

RibOxi-Seq and Nm-Seq
Two independently published protocols, RibOxi-Seq [57] and Nm-Seq [58,59], both exploit the resistance of 2′-O-methylated 3′-terminal riboses to cleavage by periodate (IO₄⁻). RNA is first randomly fragmented (by the nuclease benzonase for RibOxi-Seq, or by a fragmentation reagent followed by end repair for Nm-Seq), leaving 5′-phosphate and 3′-OH extremities, and these RNA fragments are subjected to periodate oxidation (Figure 4c). Protected 2′-O-Me 3′-termini are quite resistant to periodate, but all unmodified cis-diol riboses are destroyed and converted to dialdehydes (see Figure 1b). These oxidized riboses are no longer competent for 3′-adapter ligation and are thus excluded from the generated library, so that only 2′-O-methylated extremities are enriched in the sequencing reads obtained. However, the nuclease or reagent used for fragmentation certainly has preferential recognition sequences, and thus RNA cleavage is not truly random. In addition, cleavage exactly at the 2′-O-methylated nucleotides is highly inefficient or almost totally absent. To overcome these biases, multiple repetitive cycles of oxidation/β-elimination/de-phosphorylation are required (up to 8 cycles), considerably increasing the loss of material during these treatment steps. Therefore, a substantial amount of input RNA is generally required for these applications. In addition, the presence of 2′-O-Me at the 3′-extremity is known to reduce the efficiency of 3′-adapter ligation (see above), thus further reducing the library yield and representativeness.

Area of Applications
Even though deep sequencing-based methods for RNA 2′-O-methylation analysis were developed only recently, some of these approaches are already extensively used for RNA modification profiling under different physiological conditions. For the moment, the most popular application undoubtedly remains the profiling of 2′-O-methylations in eukaryotic rRNAs by different versions of RiboMethSeq (Table 1). Since rRNA represents almost 90% of total RNA in almost any cell type, this analysis can be performed directly on total RNA, without preliminary fractionation or enrichment, and at a moderate sequencing depth and cost. Since the human ribosome contains at least 110 confirmed 2′-O-methylation sites, this paves the way for studies of rRNA 2′-O-methylation dynamics under various physiological conditions and in biomedical studies of pathologies. Examples of such modulations have been recently published [60][61][62][63]. Other natural targets for RiboMethSeq are tRNAs from all organisms, since these small non-coding RNAs (ncRNAs) contain a number of known 2′-O-methylation sites. Transfer RNA analysis is more complicated than that of rRNA, but when the positions of modified residues are known, high-throughput quantification of tRNA 2′-O-methylation can be reliably performed [64]. It was also demonstrated that with increased sequencing depth even low-abundance ncRNAs (like human snRNAs) can be directly accessed in the total RNA fraction [65], opening the way for analysis of other low-abundance RNA types. For the moment, alternative orthogonal approaches (2′-OMe-Seq, RibOxi-Seq and Nm-Seq) are less widespread, certainly due to the excessive amount of input RNA required for analysis (see below), which precludes their large-scale application in biomedical research.

Specificity and Sensitivity of 2′-O-Methylation Detection
Direct comparison of these high-throughput methods for their performance in ab initio discovery of modified residues is not easy, since their validation was generally performed with different model RNAs (most often yeast S. cerevisiae and human rRNAs) and, in addition, using different subsets of 'previously validated' modification sites. Since eukaryotic rRNA may also change its methylation status depending on the cell line used, and even on cultivation conditions and media composition, false-negative hits may also be explained by such undermethylation. Moreover, different metrics were used for performance measurement, such as Receiver Operating Characteristic (ROC) curve parameters (maximal Matthews Correlation Coefficient, MCC, and/or Area Under the Curve, AUC), as well as simpler threshold-based models defining which candidate sites are considered validated.
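To make the "maximal MCC" metric concrete, the following minimal Python sketch scans score thresholds against a set of previously validated sites and reports the threshold giving the highest Matthews Correlation Coefficient. The MCC formula itself is standard; the function names, scores and labels are hypothetical and do not come from any of the cited studies.

```python
import math

def matthews_cc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from a confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_threshold(scores, is_known_site):
    """Return (max MCC, threshold) when calling sites with score >= threshold."""
    best = (float("-inf"), None)
    for thr in sorted(set(scores)):
        tp = fp = tn = fn = 0
        for score, known in zip(scores, is_known_site):
            called = score >= thr
            if called and known:
                tp += 1
            elif called and not known:
                fp += 1
            elif not called and known:
                fn += 1
            else:
                tn += 1
        mcc = matthews_cc(tp, fp, tn, fn)
        if mcc > best[0]:  # ties keep the lowest threshold
            best = (mcc, thr)
    return best

# Hypothetical per-position scores and "previously validated" labels.
scores = [0.95, 0.90, 0.20, 0.85, 0.10, 0.60, 0.05, 0.92]
labels = [True, True, False, False, False, True, False, True]
best_mcc, best_thr = best_threshold(scores, labels)
```

Because the result depends on which sites are treated as validated, two studies using different reference sets can report different maximal MCC values for the same method, which is the comparability problem described above.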

Required Amount of Input RNA
High-throughput approaches differ considerably in the amount of input RNA required for analysis. This may be a minor issue for many basic research projects, where a substantial amount of input RNA is easily obtained, but it strongly limits the application of otherwise promising techniques to biomedical projects (e.g., human clinical samples), where biological material is precious and extremely limited.
The best sensitivity reported so far is for variants of RiboMethSeq. Depending on the library preparation protocol, this approach requires as little as 10 ng of total human RNA for complete analysis of rRNA 2′-O-methylations [53]. Routine analysis is performed with 50 ng, which is compatible with many biomedical projects. Other versions of RiboMethSeq are somewhat less sensitive; however, they still fit into the single-digit µg range [54,60]. As anticipated, RT-based 2′-OMe-Seq provides comparable sensitivity, but still requires 2 × 2 µg for the analysis of a single sample [56]. Finally, the two versions employing IO₄⁻-based oxidation require the highest amount of input material: the original publications used 7.5 µg (RibOxi-Seq) [57] or 10 µg (Nm-Seq) of total human RNA for rRNA analysis, or the same amount of the polyA-selected mRNA fraction for a transcriptome-wide study [58,59].

Required Depth of Sequencing
At first glance, RiboMethSeq analysis requires the highest sequencing coverage, since the signal is defined as the protection of a given phosphodiester bond against cleavage, compared to surrounding RNA positions. In principle, an average coverage of 5′-/3′-ends of about 100 would be sufficient for reliable analysis, which corresponds to about 750,000 reads for human rRNA. However, this reasoning does not take into account the irregularity of cleavage due to highly structured rRNA regions. In practice, about 20 times more raw reads are required for reliable coverage of all rRNA positions. Thus we routinely use 12-15 million raw reads for the analysis of human rRNA or tRNAs by RiboMethSeq [53,64]. A similar sequencing depth has been used by others [54,60].
Despite the expected enrichment of the signal due to specific detection of RT stops (2′-OMe-Seq) or of protected methylated RNA 3′-ends (RibOxi-Seq and Nm-Seq), the reported sequencing depth appears to be quite similar, ranging from 10-15 million raw reads for 2′-OMe-Seq to 15-40 million raw reads for RibOxi-Seq and Nm-Seq [57,58].
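The read-depth estimate above can be reproduced with a short calculation. This is a sketch under stated assumptions: each read is taken to contribute one counted end, and the number of analysed positions (7,500) is back-calculated from the figures quoted in the text (750,000 reads at a coverage of about 100) rather than taken from a reference annotation. The function name and the `irregularity_factor` parameter are hypothetical.

```python
def reads_needed(n_positions, target_end_coverage, irregularity_factor=1):
    """Raw reads required, assuming one counted end per read.

    n_positions: number of RNA positions to cover.
    target_end_coverage: desired average number of ends per position.
    irregularity_factor: multiplier compensating uneven cleavage
        (about 20 according to the text).
    """
    return n_positions * target_end_coverage * irregularity_factor

# Assumed: about 7,500 positions, implied by 750,000 reads / coverage of 100.
ideal = reads_needed(7_500, 100)                              # 750000
practical = reads_needed(7_500, 100, irregularity_factor=20)  # 15000000
```

The practical value of 15 million reads falls at the upper end of the 12-15 million raw reads quoted for routine analysis.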

Quantification of the Methylation Level
Precise quantification of the 2′-O-methylation level, and thus analysis of modification dynamics in RNA, is possible only with RiboMethSeq, since the protection signal depends linearly on the methylation level [53,66]. Technical and biological replicates demonstrated that the average standard deviation for yeast or human rRNA 2′-O-methylation sites is close to 5%, and only very few sites show higher dispersion. In practice, a difference of >10% in the calculated MethScore (ScoreC in previous publications) can be considered statistically significant. Absolute quantification of the methylation level can be achieved if the exact values of MethScore in the absence of modification are known (unmodified transcripts or RNA from knock-out (KO) strains). In vitro transcripts were used for calibration of yeast and human rRNA modification levels, while yeast and Escherichia coli strains deleted for the corresponding RNA modification enzymes are useful for tRNA analysis [64,66].
In addition to RiboMethSeq, 2′-OMe-Seq can also provide some relative quantification with an appropriate spike-in of in vitro RNA transcripts, even though absolute quantification remains impossible [56]. In contrast, methods based on enrichment of methylated RNA 3′-ends (RibOxi-Seq and Nm-Seq) do not provide any quantitative information.
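Given the linear dependence stated above, calibration against an unmodified control can be sketched as a simple linear interpolation. This is an illustrative assumption rather than the published calibration procedure: it supposes that a fully methylated site would give a score of 1.0, and the function names and example values are hypothetical. The 10% cut-off is the practical value quoted in the text.

```python
def methylation_fraction(meth_score, unmodified_score, full_score=1.0):
    """Linear interpolation between the unmodified baseline and full protection.

    unmodified_score: score measured on an unmodified transcript or KO strain.
    full_score: score assumed for a fully methylated site (assumption: 1.0).
    """
    if full_score == unmodified_score:
        raise ValueError("baseline and full-protection scores must differ")
    fraction = (meth_score - unmodified_score) / (full_score - unmodified_score)
    return min(1.0, max(0.0, fraction))  # clip to the 0..1 range

def is_significant_change(score_a, score_b, min_delta=0.10):
    """Apply the practical >10% MethScore difference rule from the text."""
    return abs(score_a - score_b) > min_delta

# Hypothetical site: observed score 0.6, unmodified baseline 0.2.
fraction = methylation_fraction(0.6, 0.2)
changed = is_significant_change(0.90, 0.75)
unchanged = is_significant_change(0.90, 0.85)
```

In this toy case the site would be estimated as roughly half-methylated; a score change from 0.90 to 0.75 exceeds the 10% rule, whereas a change from 0.90 to 0.85 does not.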

Sequencing and Bioinformatics Issues
Different technological platforms can be used for sequencing of the generated amplicons (libraries). Illumina sequencing (generally HiSeq or NextSeq sequencers) remains the most popular in the field, though PGM/Ion Proton devices are also suitable. Standard RiboMethSeq requires only the single-read 50 nt sequencing mode (SR50); paired-end sequencing does not substantially improve the results. Similarly, only single-read sequencing is in principle required for 2′-OMe-Seq, since only the cDNA 3′-ends are of interest (see Figure 4b). In contrast, for RibOxi-Seq and Nm-Seq paired-end sequencing is mandatory, since the important information resides at the beginning of read2.
Data treatment and analysis steps are similar in all approaches: read processing generally begins with trimming, followed by alignment to the reference sequence and counting of the 5′- or 3′- (or both) ends of mapped reads. Special care should also be taken at the mapping step to avoid multiply mapped sequences of unknown origin.
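The end-counting step, including the exclusion of ambiguously mapped reads, can be sketched in plain Python. The `Alignment` record, the use of a mapping-quality cut-off as a proxy for unique mapping, and the value `min_mapq=10` are assumptions for illustration; how multi-mapped reads are flagged depends on the aligner actually used.

```python
from collections import Counter
from typing import NamedTuple

class Alignment(NamedTuple):
    """Hypothetical minimal record for one mapped, trimmed read."""
    ref: str    # reference name, e.g. "18S"
    start: int  # 1-based position of the read's 5'-end on the reference
    end: int    # 1-based position of the read's 3'-end on the reference
    mapq: int   # mapping quality reported by the aligner

def count_ends(alignments, min_mapq=10):
    """Count 5'- and 3'-ends per reference position, skipping low-MAPQ reads."""
    five_prime, three_prime = Counter(), Counter()
    for aln in alignments:
        if aln.mapq < min_mapq:
            continue  # likely multi-mapped or unreliable; assumed cut-off
        five_prime[(aln.ref, aln.start)] += 1
        three_prime[(aln.ref, aln.end)] += 1
    return five_prime, three_prime

# Hypothetical alignments; the third one is treated as a multi-mapper.
reads = [
    Alignment("18S", 10, 40, 42),
    Alignment("18S", 10, 35, 42),
    Alignment("18S", 12, 40, 0),
    Alignment("28S", 5, 30, 42),
]
five, three = count_ends(reads)
```

The resulting per-position counters are the kind of 5′/3′-end coverage profile from which protection or enrichment signals are subsequently derived.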

Conclusions
The combination of traditional and deep sequencing-based approaches for RNA 2′-O-methylation analysis now opens the way to an exhaustive identification of novel modified sites in diverse cellular RNAs, as well as to careful investigation of RNA 2′-O-methylation dynamics under various physiological conditions and in human pathologies related to RNA modifications.