RNA modifications regulate the complex life of transcripts. An experimental approach called LAIC-seq was developed to characterize modification levels on a transcriptome-wide scale. In this method, modified and unmodified molecules are separated using antibodies specific for a given RNA modification (e.g., m6A). In essence, the biochemical separation yields three fractions (input, eluate, and supernatant), which are subjected to RNA-seq. In this work, we present a bioinformatics workflow that starts from RNA-seq data and uses a statistical model to infer gene-specific modification levels on a transcriptome-wide scale. Our workflow centers around the pulseR package, which was originally developed for the analysis of metabolic labeling experiments. We demonstrate how to analyze data without external normalization (i.e., in the absence of spike-ins), given a high efficiency of separation, and how, alternatively, scaling factors can be derived from unmodified spike-ins. Importantly, our workflow quantifies the uncertainty of the estimates by providing confidence intervals for model parameters, such as gene expression and RNA modification levels. We also compare two alternative model parametrizations, the log-odds and the proportion of modified molecules, and discuss the pros and cons of each representation. In summary, our workflow is a versatile approach to RNA modification level estimation that is open to any read-count-based experimental approach.
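The relation between the two parametrizations mentioned above can be illustrated with a minimal sketch. It is not part of the pulseR package; the function names and the numerical values are hypothetical and chosen for illustration only. The sketch assumes that a confidence interval is available on the log-odds scale and maps it to the proportion scale, where it stays inside (0, 1) but is no longer symmetric around the estimate.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def interval_to_proportion(lower, upper):
    """Transform an interval on the log-odds scale to the proportion scale.

    The transform is monotone, so the end points can be mapped directly.
    """
    return inv_logit(lower), inv_logit(upper)


# Hypothetical estimate and interval half-width on the log-odds scale
# (illustrative numbers, not results from the workflow).
estimate = 2.0
half_width = 1.0

p = inv_logit(estimate)
lo, hi = interval_to_proportion(estimate - half_width, estimate + half_width)

print(f"proportion modified: {p:.3f}, interval: [{lo:.3f}, {hi:.3f}]")
```

On the log-odds scale the parameter is unbounded, which is convenient for optimization and for symmetric intervals; the proportion is easier to interpret but is bounded, so the same interval becomes asymmetric near 0 or 1.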
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.