Open Access Article
Genes 2018, 9(12), 604; https://doi.org/10.3390/genes9120604

Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures

1
Center for Non-Coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
2
National Centre for Biological Sciences, Tata Institute for Fundamental Research, Bangalore 560065, India
*
Author to whom correspondence should be addressed.
Received: 25 September 2018 / Revised: 28 November 2018 / Accepted: 29 November 2018 / Published: 4 December 2018
(This article belongs to the Special Issue Computational Analysis of RNA Structure and Function)

Abstract

Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. Precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering in the search for common motifs. Such efforts have so far focused on single sequences; here we therefore present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, that finds boundaries based on the probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that, independent of the chosen method, multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences. The actual performance of the two methods differs between single hairpin structures and branched structures. For the RNA families with branched structures, including transfer RNAs (tRNAs) and small nucleolar RNAs (snoRNAs), RNAbound with multiple sequence alignments improves the boundary predictions to median differences of −6 and −11.5 nucleotides (nts) for the left and right boundaries, respectively (window size of 200 nts).
Keywords: RNA secondary structure; RNA structure boundary; RNA domain; non-coding RNA gene finder
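The abstract reports boundary accuracy as median differences, in nucleotides, for the left and right boundaries. The sketch below illustrates how such a summary could be computed. It is not the authors' evaluation code: the sign convention (predicted minus annotated position), the function name `median_boundary_differences`, and the coordinates in `pairs` are all assumptions made for illustration.

```python
from statistics import median


def median_boundary_differences(pairs):
    """Summarise boundary errors over a set of structured regions.

    `pairs` is a list of ((pred_left, pred_right), (annot_left, annot_right))
    tuples. Differences are computed as predicted minus annotated, which is
    an assumed sign convention, not one taken from the article.
    Returns (median_left_difference, median_right_difference).
    """
    left_diffs = [pred[0] - annot[0] for pred, annot in pairs]
    right_diffs = [pred[1] - annot[1] for pred, annot in pairs]
    return median(left_diffs), median(right_diffs)


# Hypothetical coordinates inside a scanning window; not data from the study.
pairs = [
    ((95, 172), (100, 180)),
    ((101, 190), (105, 200)),
    ((48, 120), (55, 133)),
    ((10, 80), (16, 90)),
]

left_median, right_median = median_boundary_differences(pairs)
print(f"median left difference:  {left_median} nts")
print(f"median right difference: {right_median} nts")
```

Reporting the two boundaries separately, as the abstract does, keeps a systematic offset on one side from being hidden by a well-placed boundary on the other.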
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Sabarinathan, R.; Anthon, C.; Gorodkin, J.; Seemann, S.E. Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures. Genes 2018, 9, 604.

Genes, EISSN 2073-4425, published by MDPI AG, Basel, Switzerland