Potential Misidentification of Natural Isomers and Mass-Analogs of Modified Nucleosides by Liquid Chromatography–Triple Quadrupole Mass Spectrometry

Triple quadrupole mass spectrometry coupled to liquid chromatography (LC-TQ-MS) can detect and quantify modified nucleosides present in various types of RNA, and is being used increasingly in epitranscriptomics. However, due to the low resolution of TQ-MS and the structural complexity of the many naturally modified nucleosides identified to date (>160), the discrimination of isomers and mass-analogs can be problematic and is often overlooked. This study analyzes 17 nucleoside standards by LC-TQ-MS with separation on three different analytical columns and discusses, with examples, three major causes of analyte misidentification: structural isomers, mass-analogs, and isotopic crosstalk. It is hoped that this overview and practical examples will help to strengthen the accuracy of the identification of modified nucleosides by LC-TQ-MS.


Introduction
Research into post-transcriptional RNA modification is increasingly focusing on its critical impacts on RNA decay, translational efficiency, subcellular localization, quality control, RNA-protein interactions, and disease development [1][2][3]. Intensely mined information from the epitranscriptome reveals that dynamic modification of RNA is a response to physiological and environmental changes, although the biological consequences of such modifications are still being elucidated [3][4][5]. For example, methylation of key transcripts to give N6-methyladenosine (m6A) has been observed in budding yeast and was crucial to the initiation of meiosis during nitrogen starvation [6]. YTHDF1, an m6A reader protein that enhances translational efficiency by recruiting eukaryotic initiation factor 3, was found to be concentrated on stress granules and triggered stalled translation when arsenite was added to induce oxidative stress in HeLa cells [7]. YTHDF2 (an m6A reader) and FTO (an m6A eraser) competitively bind to the 5′-UTR of messenger (m)RNA and regulate m6A methylation in a mouse embryonic fibroblast (MEF) cell line, which facilitates cap-independent translation of specific transcripts under stress conditions [8]. On transfer RNAs (tRNAs), modifications at positions 34 (the wobble position), 37, and 58 can be dynamic and related to environmental factors. When taurine supply was limited, τm5U34 (5-taurinomethyluridine) and τm5s2U34 (5-taurinomethyl-2-thiouridine) on five mitochondrial tRNAs (mt-tRNAs) switched to cmnm5U34 (5-carboxymethylaminomethyluridine) and cmnm5s2U (5-carboxymethylaminomethyl-2-thiouridine) with unknown consequences [9].
Bicarbonate-free air culturing of HEK293T cells triggered the downregulation of t6A37 (N6-threonylcarbamoyladenosine) on mt-tRNAs decoding ANN codons and probably impaired the translation of mitochondrial respiratory complex I [10]. m1A58 (1-methyladenosine) in human cytoplasmic tRNAs was influenced by the glucose concentration of the culture medium via an FTO (m1A eraser)-dependent pathway that adjusted the decoding preference [11].
To determine the identity, quantity, and location of RNA modifications, RNA sequencing (NGS or NNGS approaches) [12][13][14], oligonucleotide mass spectrometry [15,16], and nucleoside mass spectrometry [17] are frequently used. Nuclear magnetic resonance spectroscopy has also recently been used to observe the dynamic incorporation of RNA modifications into nascent tRNA [18,19]. A generic protocol to qualitatively and quantitatively analyze modified nucleosides has been developed using high-performance liquid chromatography coupled to triple quadrupole mass spectrometry (LC-TQ-MS) [20]. Total RNA or a purified fraction is hydrolyzed and dephosphorylated to the free nucleosides using alkaline phosphatase, phosphodiesterase I, and nuclease P1. Two-step digestion is recommended because basic pH suits alkaline phosphatase, while phosphodiesterase I and nuclease P1 are more effective in acidic conditions. However, some studies recommend an acidic environment for the complete digestion because certain RNA modifications are sensitive to pH. For instance, ct6A (cyclic N6-threonylcarbamoyladenosine) is quickly converted to t6A in mild alkaline buffer, with an 18 Da mass increase that corresponds to the addition of water [21]. The enzymes are then removed from the digestion mixture using a 10 kDa cut-off centrifugal filter unit, and the resulting nucleosides are vacuum-dried and dissolved in an appropriate solvent, usually 90% (v/v) acetonitrile or water, depending on the downstream liquid chromatography. Hydrophilic interaction liquid chromatography (HILIC) [22] and reversed-phase chromatography (RPC) [23] with semi-micro flow rates (0.1-1 mL/min) have been used to separate nucleosides, while micro-flow HPLC (5-50 µL/min) has recently been recommended for higher sensitivity without loss of reproducibility [24].
Multiple reaction monitoring (MRM) mode is used with TQ-MS when determining modified nucleosides. Collision-induced dissociation of the N-β-glycosidic bond under the appropriate collision energy (CE) yields the protonated nucleobase product ion, BH2+. Ideally, synthetic standards of the modified nucleosides should be used to confirm the optimum CE, ion mass-to-charge ratios (m/z), specific product ions, and retention times, but their preparation is expensive, time-consuming, and labor-intensive. The m/z values for precursor (MH+) and product (BH2+) ions can be readily calculated from their chemical structures (Figure 1). CEs for detecting modified nucleosides can be estimated using native (unmodified) adenosine (A), uridine (U), cytidine (C), and guanosine (G) standards. This may not yield exact values for the modified nucleosides but is a practical compromise. Some published LC-TQ-MS applications for RNA modifications use fewer than 20 modified nucleoside standards and determine other nucleosides using calculated MRM transitions and estimated CE values [20,25].
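As a minimal sketch of the calculation mentioned above (not the authors' software), the Python snippet below derives an MH+ → BH2+ transition from an elemental formula. It assumes positive-mode protonation and the neutral loss of ribose (C5H8O4), or of 2′-O-methylribose (C6H10O4) for 2′-O-methylated nucleosides. The function names and the elemental formulas are illustrative assumptions worked out from the named structures, not values taken from the text.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes, and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "S": 31.97207117}
PROTON = 1.00727647


def mono_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C10H13N5O4'."""
    return sum(MONO[el] * int(n or 1)
               for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula))


def mrm_transition(formula, sugar_loss="C5H8O4"):
    """Return (MH+, BH2+) m/z, assuming loss of the neutral sugar in CID."""
    mh = mono_mass(formula) + PROTON
    bh2 = mh - mono_mass(sugar_loss)
    return mh, bh2


# Hypothetical usage; formulas are assumptions derived from the structures.
for name, formula, loss in [
    ("A", "C10H13N5O4", "C5H8O4"),
    ("m6A", "C11H15N5O4", "C5H8O4"),
    ("Am", "C11H15N5O4", "C6H10O4"),  # methyl on the ribose, not the base
]:
    mh, bh2 = mrm_transition(formula, loss)
    print(f"{name}: {mh:.1f} -> {bh2:.1f}")
```

With these inputs, m6A and Am share the same precursor m/z but differ in the product ion, which matches the 282.1→150 and 282.1→136 transitions used in this study. Masses computed with the hydrogen atom mass instead of the proton mass differ by about 0.0005 Da, which is irrelevant at TQ-MS resolution.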
Over 160 natural RNA modifications have been identified to date [26]. However, this complexity and the low resolution of TQ-MS (approximately 0.5 Da) can give rise to three types of misidentification when using MH+ → BH2+ MRM transitions to determine nucleosides. Type I mistakes result from regioisomers. For instance, five natural monomethylated adenosines have been identified: m1A, m2A (2-methyladenosine), m6A, m8A (8-methyladenosine), and Am (2′-O-methyladenosine) (Figure 2). They have been found variously in mRNA, tRNA, and ribosomal (r)RNA [26]. Am can be discriminated by monitoring the transition m/z 282.1→136 because its nucleobase is unmodified, while m1A, m2A, m6A, and m8A all give rise to the transition m/z 282.1→150 (Figure 2). High-resolution mass spectrometry can discriminate between these four monomethylated derivatives via in-source collision-induced dissociation (CID) and negative mode ionization, which produce unique fragment patterns [27]. However, this approach requires a library of MS2 and MS3 spectra for each isomer, and sensitivity can be reduced for quantitative analysis.
Although synthetic standards can strengthen the identification of nucleoside isomers, some cannot be separated (or are closely eluted) by liquid chromatography because of their structural similarity. Type II mistakes with TQ-MS are nucleoside misidentifications due to similar masses (<0.5 Da mass differences). m6,6A (N6,N6-dimethyladenosine) and f6A (N6-formyladenosine) exhibit MH+ masses of 296.1359 and 296.0995, respectively. Both are modified on the nucleobase (Figure 3). Low-resolution TQ-MS monitoring of m/z 296.1→164 cannot distinguish between these two compounds. Furthermore, m6,6A has an isomer, m2,8A (2,8-dimethyladenosine), which can lead to the complicated situation of f6A/m6,6A being present in a eukaryotic total RNA sample, or m6,6A/m2,8A being present in a eubacterial rRNA sample [26].
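The type II case can be made concrete with a short calculation. The sketch below, with hypothetical helper names, checks whether two precursor ions fall within an assumed 0.5 Da window and estimates the resolving power (m/Δm) that would be needed to separate them. The two MH+ values are those quoted above; the window width is the approximate TQ-MS resolution stated in the text.

```python
def same_channel(mz_a, mz_b, window=0.5):
    """True if two m/z values differ by less than the assumed window (Da)."""
    return abs(mz_a - mz_b) < window


def required_resolving_power(mz_a, mz_b):
    """Approximate m / delta-m needed to separate two ions."""
    return ((mz_a + mz_b) / 2) / abs(mz_a - mz_b)


# MH+ values quoted in the text for m6,6A and f6A.
m66a, f6a = 296.1359, 296.0995

print("share a 0.5 Da window:", same_channel(m66a, f6a))
print("resolving power needed:", round(required_resolving_power(m66a, f6a)))
print("resolving power at 0.5 Da:", round(296.1 / 0.5))
```

The required resolving power is more than ten times what a 0.5 Da window provides at this m/z, so chromatographic separation or a synthetic standard is needed to tell the two compounds apart on a triple quadrupole instrument.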
Type III misidentification arises from isotopic crosstalk, which is often not considered. For instance, the low-resolution isotopic distributions of positively ionized adenosine and inosine are shown in Figure 4. The transition 269.1→137 used to monitor inosine is subject to interference from the same transition arising from an isotopologue of adenosine. When adenosine and inosine co-elute in a short HPLC method, the apparent inosine signal is inflated by the adenosine isotopologue. Given the abundance of heavy isotopes in small molecules composed of the elements C, H, O, N, and S, nucleosides differing in mass by one or two units are likely to interfere with each other. Based on the natural abundance of isotopes of these elements (13C 1.11%, 2H 0.0115%, 18O 0.205%, 15N 0.364%, and 34S 4.21%), such mass differences mainly arise from 13C and 34S. Such mass-analogs are not rare among naturally modified nucleosides: crosstalk between mcm5U (5-methoxycarbonylmethyluridine, 317.1→185) [28], ncm5s2U (5-carbamoylmethyl-2-thiouridine, 318.1→186) [29], and cm5s2U (5-carboxymethyl-2-thiouridine, 319.1→187) [30] is a good example and is illustrated in Figure 5.
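To give a feel for the size of this effect, the following sketch estimates the fraction of the adenosine signal that would appear one mass unit higher in both quadrupoles. It is a first-order approximation under stated assumptions: only single 13C, 2H, and 15N substitutions are counted, using the abundances quoted above; the heavy atom must sit in the nucleobase so that both precursor and product shift by one unit; and the formula of the protonated adenine fragment (C5H6N5) is derived from the structure rather than taken from the text. The function name is hypothetical.

```python
import re

# Heavy-isotope abundances quoted in the text that shift the mass by +1.
HEAVY_PLUS1 = {"C": 0.0111, "H": 0.000115, "N": 0.00364}


def plus1_ratio(formula):
    """First-order (M+1)/M intensity ratio for a single heavy atom."""
    ratio = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        abundance = HEAVY_PLUS1.get(element, 0.0)
        ratio += int(count or 1) * abundance / (1.0 - abundance)
    return ratio


# Protonated adenine fragment (BH2+) of adenosine, assumed to be C5H6N5.
base_ratio = plus1_ratio("C5H6N5")
print(f"adenosine signal leaking into 269.1 -> 137: {base_ratio:.1%}")

# Illustrative scenario: adenosine 100-fold more abundant than inosine,
# with equal response factors (an assumption).
print(f"crosstalk relative to inosine signal: {100 * base_ratio:.1f}x")
```

Under these assumptions, roughly 7 to 8% of the adenosine response would appear in the inosine channel, which is enough to swamp a co-eluting inosine peak whenever adenosine is present in large excess.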
These three types of misidentifications can occur in the same RNA sample; thus, correctly identifying a nucleoside by TQ-MS is not straightforward. This study illustrates this complexity by describing the analysis of 17 nucleosides and modified nucleosides using three commercially available liquid chromatography columns. Ongoing discussion of the separation of commonly modified nucleosides using reversed-phase and HILIC chromatography, and of the signals derived from TQ-MS, will help to minimize the misidentifications described above.
Stock solutions of nucleosides (10-100 mM) were prepared in dimethyl sulfoxide and stored at −20 °C. Mixed standard solutions were prepared by diluting stock solutions with 0.1% formic acid in acetonitrile/water (90/10, v/v) for HILIC chromatography, or water containing 0.1% formic acid for reversed-phase chromatography.

Analytical Chromatography Columns
Three analytical columns were evaluated for their ability to resolve modified nucleosides over a 50 min chromatographic runtime: Acquity BEH amide (1.7 µm, 2.1 mm × 150 mm; Waters), Discovery HS F5 (3 µm, 2.1 mm × 150 mm; Supelco), and Atlantis T3 (3 µm, 2.1 mm × 150 mm; Waters) (Figure 6). The Acquity BEH amide (HILIC) column was packed with ethylene-bridged hybrid (BEH) particles bearing trifunctionally bonded amide groups, although the linker structure of BEH amide has not been published by Waters Corporation. The Discovery HS F5 column was filled with spherical silica gel carrying a pentafluorophenyl (PFP) stationary phase attached through a propyl spacer. The Atlantis T3 column was an octadecyl silica-based (ODS), reversed-phase C18 column with optimized pore diameter, C18-ligand density, and end-capping. Excellent performance in the separation of highly polar chemicals, including carbohydrates and nucleoside triphosphates, has been reported using these columns [31][32][33].

Adenosine Derivatives
One pmol each of m1A, m2A, m6A, m1I (1-methylinosine), m6,6A, and f6A were mixed and resolved on the three columns (Figure 7). In positive mode, the transition 282.1→150 was monitored to detect m1A, m2A, and m6A, 283.1→151 to detect m1I, and 296.1→164 to detect m6,6A and f6A. The methylated adenosines were eluted in the order m6A/m2A/m1A under HILIC conditions, but m1A/m2A/m6A under PFP and ODS conditions (Figure 7A). With the PFP column, m2A and m6A did not achieve baseline separation, which may lead to type I misidentification.
When m/z 283.1→151 was used to detect m1I, type III misidentification (isotopic crosstalk) could occur. Interfering signals from isotopologues of m1A, m2A, and m6A were evident adjacent to the primary m1I peak (Figure 7B). With HILIC separation, a small m6A peak eluted close to m1I, while an m2A peak co-eluted with m1I under PFP conditions. Isotopic crosstalk should be carefully ruled out, especially when the target analyte is in low abundance and may be masked by an isotopologue of another nucleoside.
Type II misidentification could occur for m6,6A and f6A, which eluted close together under HILIC conditions (Figure 7C), although baseline separation was achieved. TQ-MS does not have sufficient mass resolution to distinguish the mass difference between these two nucleosides (0.0364 Da); thus, the transition 296.1→164 acquired signals from both compounds. It is noteworthy that, although one pmol of each standard was injected, the signal for f6A was much lower than for m6,6A. This can be explained by the N6-formyl group of the f6A nucleobase tending to be deprotonated rather than protonated. This weak response under positive mode ionization increases the chance of f6A being misidentified as m6,6A in the absence of synthetic standards. Therefore, it is recommended to use a PFP or ODS column to detect m6,6A and f6A because of the large difference in retention times under these conditions (Figure 7C).

Uridine and Cytidine Derivatives
The m/z differences between protonated uridine and cytidine (m/z 245.1 and 244.1), and between their nucleobases (m/z 113 and 112), are approximately one. Thus, monitoring of uridine derivatives can be subject to interfering signals from isotopologues of cytidine derivatives. For example, the transition m/z 259.1→127 can detect m5U (5-methyluridine), m3U (3-methyluridine), and isotopologues of m5C (5-methylcytidine), m4C (N4-methylcytidine), and m3C (3-methylcytidine). Type I, II, and III misidentifications are possible, in some cases, in a single chromatogram monitoring uridine derivatives. Two standard mixtures were prepared to illustrate this complexity: firstly, 1 pmol each of U (uridine), C (cytidine), and Ψ (pseudouridine); and secondly, 1 pmol each of s2U (2-thiouridine), s4U (4-thiouridine), ho5U (5-hydroxyuridine), m5D (5-methyldihydrouridine), and s2C (2-thiocytidine). Figure 8A illustrates the monitoring of uridine (m/z 245.1→113) with isotopic crosstalk from cytidine. However, although the mass of pseudouridine is identical to that of uridine, the protonated nucleobase of pseudouridine is not seen. Instead, pseudouridine is uniquely identified by the transitions m/z 245.1→209/179/155. The CID fragmentation of pseudouridine is shown in Figure 8B [34]. The product ions of pseudouridine derivatives, such as m1Ψ (1-methylpseudouridine) and Ψm (2′-O-methylpseudouridine), can be predicted using this fragmentation pattern to avoid interference with uridine derivatives.
Type I, II, and III potential misidentifications can be observed simultaneously in the mixture of s2U, s4U, ho5U, m5D, and s2C standards. Figure 8C shows how peaks for all five modified nucleosides appear in one MRM channel (m/z 261.1→129): s2U and s4U are structural isomers (type I); s2U, s4U, ho5U, and m5D have similar precursor and product ion masses (<0.5 Da difference) (type II); and a cross-talking isotopologue of s2C can be observed in the primary s2U/s4U/ho5U/m5D channel (type III).
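The three-way classification applied to this mixture can be sketched in a few lines of Python. The snippet is illustrative only: the elemental formulas are assumptions derived from the named structures, the 0.5 Da window is the approximate TQ-MS resolution stated earlier, and the helper names are hypothetical. Because all five standards carry an unmodified ribose, a precursor mass difference implies the same difference in the nucleobase product ion; the sketch does not handle isomers that differ in where the modification sits (base versus sugar).

```python
import re
from itertools import combinations

# Monoisotopic masses (u) of the lightest stable isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "S": 31.97207117}

# Elemental formulas derived from the named structures (assumed).
STANDARDS = {
    "s2U": "C9H12N2O5S",
    "s4U": "C9H12N2O5S",
    "ho5U": "C9H12N2O7",
    "m5D": "C10H16N2O6",
    "s2C": "C9H13N3O4S",
}


def mono_mass(formula):
    """Monoisotopic mass of a simple elemental formula."""
    return sum(MONO[el] * int(n or 1)
               for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula))


def classify(a, b, window=0.5):
    """Return 'I', 'II', 'III', or None for a pair of standard names."""
    fa, fb = STANDARDS[a], STANDARDS[b]
    if fa == fb:
        return "I"            # same formula: structural isomers
    delta = abs(mono_mass(fa) - mono_mass(fb))
    if delta < window:
        return "II"           # mass-analogs within the window
    if delta < 2 + window:
        return "III"          # one or two units apart: isotopic crosstalk
    return None


for a, b in combinations(STANDARDS, 2):
    print(f"{a:>4} vs {b:<4} type {classify(a, b)}")
```

Run on these five standards, the sketch reproduces the grouping described for the m/z 261.1→129 channel: s2U and s4U are flagged as type I, the s2U/s4U, ho5U, and m5D pairs as type II, and every pairing with s2C as type III.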

Guanosine Derivatives
Obstacles to the identification of guanosine derivatives center on the possible isomers, particularly among the monomethylated (m1G, m2G, m7G), dimethylated (m2,2G, m2,7G), and trimethylated (m2,2Gm, m2,7Gm) guanosines. Figure 9 shows the elution sequence of m1G, m2G, and m7G on the three columns, from which the elution sequence of the dimethylated and trimethylated derivatives can be extrapolated. Figure 9 also illustrates the importance of analytical column choice, with only HILIC being capable of resolving the monomethylated guanosines.

Conclusions
Monitoring the fragmentation of protonated nucleosides into their respective nucleobases using the MRM mode of TQ-MS is useful for detecting RNA modifications. However, it is not practical to synthesize standards for many of these modified nucleosides and, consequently, three types of misidentification can result. Firstly, structural isomers can appear in the same MRM channel and may be closely eluted under HPLC (type I misidentification). Secondly, mass-analogs differing by less than 0.5 Da might also appear in the same channel and may be confused with other analytes (type II). Finally, analytes with mass differences of one or two Da can cause isotopic crosstalk (type III). This study applied a long HPLC runtime of 50 min, but some nucleosides could still not be clearly resolved on-column. Therefore, the authors suggest that reports of rapid LC-MS methods for the detection of nucleosides (typically with 5-10 min runtimes) should be examined to rule out the possibility of analyte misidentifications arising from overlapping peaks. Stable isotope labeling of nucleosides can help to prevent type II and III misidentifications, as the mass shifts due to 13C or 15N provide additional molecular composition information [35][36][37]. Potential misidentifications of frequently modified nucleosides are listed in Tables 1-3.
Modified nucleosides can contain hydrophobic nucleobases (e.g., those with isopentenyl moieties) and hydrophilic nucleobases (e.g., those with hydroxy moieties), making reversed-phase and HILIC methods suitable for the separation of different groups of nucleosides. The PFP column (Discovery HS F5) separated uridine and cytidine derivatives effectively, and the HILIC column (Acquity BEH amide) was helpful for adenosine and guanosine derivatives. The chromatographic column should be carefully chosen and the HPLC method optimized depending on the target analyte(s).
The analytical sensitivity for nucleosides under positive mode LC-TQ-MS conditions usually decreases in the order A > G > C > U (uridine being difficult to protonate under positive mode ionization due to its low pKa value). Negative mode ionization, derivatization [38], and the careful selection of product ions should be considered to improve sensitivity. The degradation, oxidation [39], and spontaneous chemical derivatization [40] of nucleosides during pre-treatment procedures should also be taken into account. We recommend the establishment of strict standards for nucleoside analysis.