Computational Pipeline for Reference-Free Comparative Analysis of RNA 3D Structures Applied to SARS-CoV-2 UTR Models
Abstract
1. Introduction
1.1. Research Context
1.2. Biological Background
2. Results
2.1. Analysis of SARS-CoV-2 5′-UTR Models
2.1.1. 3D Structure Data Normalization and Validation
2.1.2. Global RMSD-Based Pairwise Comparison of RNA 3D Models
2.1.3. RNA 2D Structure
2.1.4. RMSD-Based Pairwise Comparison and Clustering of RNA 3D Domains
2.1.5. The Comparison of Predicted SL2 Domain with Reference Experimental NMR 2L6I Structure (Solution Structure of Coronaviral Stem-Loop 2)
2.2. Analysis of SARS-CoV-2 3′-UTR Models
2.2.1. 3D Structure Data Normalization and Validation
2.2.2. Global RMSD-Based Pairwise Comparison of RNA 3D Models
2.2.3. RNA 2D Structure
2.2.4. RMSD-Based Pairwise Comparison and Clustering of RNA 3D Domains
2.2.5. The Comparison of Predicted S2M Domain with Reference Experimental Crystal G225U Structure from SARS-CoV-1
3. Discussion
4. Materials and Methods
4.1. A Choice of Input RNA Sequences
4.2. 3D Structure Prediction Methods
4.3. Methods for RNA 3D Structure Analysis
- Step 1.
- Normalization and validation of 3D structure data. RNA 3D structures were evaluated using rna-tools [62,63]. RCSB MAXIT was applied to evaluate the stereochemistry of the submitted 3D structures [64,65]. The RNAspider pipeline [7,66] was applied to identify and classify entanglements of structural elements, that is, spatial arrangements involving two structural elements in which at least one punctures the other. Here, a puncture is the situation in which one structural element (determined by the secondary structure of the molecule) intersects the area within the other [7].
- Step 2.
- RMSD-based global pairwise comparison of 3D RNA models. We compared the predicted models based on the root-mean-square deviation (RMSD) [67], which was computed for every pair of models with the RNA QUality Assessment tool (RNAQUA) [62]. RMSD-based heat maps were prepared to help identify similarities within the set of predictions. The OC cluster analysis program was run with the default settings (single linkage algorithm) to calculate the centroids of the RNA 3D structure ensembles [68].
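The pairwise RMSD computation in Step 2 can be illustrated with a minimal sketch. It is not the RNAQUA implementation; it assumes that each model is already reduced to an (N, 3) NumPy array holding the same atoms in the same order, and it uses Kabsch superposition [67]. The function names `kabsch_rmsd` and `pairwise_rmsd` and the synthetic coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("both inputs must have the same (N, 3) shape")
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Sign correction keeps R a proper rotation (no reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def pairwise_rmsd(models):
    """Symmetric matrix of Kabsch RMSD values for every pair of models."""
    n = len(models)
    matrix = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i, j] = matrix[j, i] = kabsch_rmsd(models[i], models[j])
    return matrix


# Synthetic example (assumed data): a model, a rigidly moved copy, a noisy copy.
rng = np.random.default_rng(0)
model_a = rng.normal(size=(20, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
model_b = model_a @ rot_z.T + np.array([5.0, -3.0, 2.0])
model_c = model_a + rng.normal(scale=0.5, size=(20, 3))

rmsd_matrix = pairwise_rmsd([model_a, model_b, model_c])
print(np.round(rmsd_matrix, 3))
```

A matrix of this kind is what a heat map or a clustering program would take as input; the rigidly moved copy should sit at an RMSD of essentially zero from the original.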
- Step 3.
- RNA 2D structure extraction from atom coordinate data and conservation analysis.
- Step 4.
- Determination of the 2D consensus structure. The RNAtive tool [72], together with the consensus-driven approach for identifying domains based on the RNA secondary structure (see Step 6 for more details), was used to identify a consensus over all secondary structures annotated from the input RNA 3D models. RNAtive was run with the predefined confidence threshold set to 0.51. First, the interaction network was calculated for each input RNA 3D model. Next, a consensus-driven secondary structure was calculated, taking into account all interactions whose confidence was higher than or equal to the predefined threshold.
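The thresholding idea in Step 4 can be sketched as follows. This is not RNAtive's code: it assumes that the confidence of an interaction is simply the fraction of input models in which it occurs, and that each model is given as a set of (i, j) base-pair tuples. The function name `consensus_pairs` and the example data are hypothetical.

```python
from collections import Counter


def consensus_pairs(models, threshold=0.51):
    """Return {pair: confidence} for pairs found in at least `threshold` of the models.

    Assumption: confidence = fraction of models containing the pair.
    """
    if not models:
        raise ValueError("at least one model is required")
    counts = Counter(pair for pairs in models for pair in set(pairs))
    n_models = len(models)
    return {
        pair: count / n_models
        for pair, count in counts.items()
        if count / n_models >= threshold
    }


# Four hypothetical interaction networks, reduced to base pairs (i, j).
example_models = [
    {(1, 10), (2, 9), (3, 8)},
    {(1, 10), (2, 9), (3, 8)},
    {(1, 10), (2, 9), (4, 7)},
    {(1, 10)},
]

consensus = consensus_pairs(example_models)
for pair in sorted(consensus):
    print(pair, consensus[pair])
```

With the 0.51 threshold, a pair present in exactly half of the models, such as (3, 8) here, is left out of the consensus.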
- Step 5.
- Clustering of the 2D structure of RNA. In this step, all secondary structures considered were compared pairwise using RNAdistance [73]. As RNAdistance does not handle pseudoknots, pseudoknot-forming nucleotides were treated as unpaired bases. Based on the resulting comparison matrix, the secondary structures were clustered with DBSCAN (density-based spatial clustering of applications with noise) [74], a clustering algorithm widely used in data science and machine learning. It can identify clusters of varying shapes based on a user-defined distance measure and the minimum number of points that must lie in proximity to form a cluster. Dimensionality was reduced using PCA (principal component analysis) [75].
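The pseudoknot handling in Step 5 can be sketched as a preprocessing function. It assumes that the input is in extended dot-bracket notation, where round brackets encode the nested (order-0) pairs and every other bracket or letter encodes a pseudoknot. The function name `strip_pseudoknots` is hypothetical and is not part of RNAdistance.

```python
def strip_pseudoknots(dot_bracket):
    """Mark every pseudoknot-forming nucleotide as unpaired.

    Assumption: only '(' and ')' denote nested pairs; any other symbol
    ('[', ']', '{', '}', '<', '>', letters) is replaced with '.'.
    """
    return "".join(ch if ch in "()." else "." for ch in dot_bracket)


def is_balanced(dot_bracket):
    """Check that round brackets open and close consistently."""
    depth = 0
    for ch in dot_bracket:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


pseudoknotted = "((..[[..))..]]"
flattened = strip_pseudoknots(pseudoknotted)
print(flattened)
```

The flattened string keeps its original length, so residue numbering is unchanged, and it contains only the symbols that a pseudoknot-free comparison tool expects.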
- Step 6.
- The 2D structure-based identification of RNA domains and their analysis. A two-step approach was applied to identify RNA structure domains. In the first step (consensus driven), the secondary structures of all RNAs considered were aligned. Next, the nucleotide pairing statistics were calculated, and the consensus secondary structure was generated and encoded in the extended dot-bracket notation. The consensus obtained was divided into domains. Each continuous fragment closed by base pairs appearing in ≥50% of the models considered was recognized as a domain. The aim was to find the longest possible elements closed by base pairs common to ≥50% of the models. In the second step (domain-boundary driven), each RNA secondary structure was recursively divided into continuous domains, with the aim of enlarging the previously identified domains and making them more accurate. This approach resulted in a larger number of domains; some of them were present in ≤50% of the models, while others overlapped or were part of larger ones. The base pairs involved in pseudoknot formation were analyzed independently as both unpaired and paired. With pseudoknots considered, a domain was defined as a continuous fragment located between corresponding structural elements that included opening and closing pseudoknot brackets. This routine was applied recursively to handle small domains nested in larger ones. Then, a statistical analysis of the identified domains was performed. Color-scaled maps of the analyzed regions were prepared, showing the localization of the domains (Y-axis) within the input sequence (X-axis).
To perform a detailed analysis of the results, each domain was described by residue range, sequence, secondary structure, number of residues, number of participants who submitted models supporting the domain, distribution of the number of models within modeling groups, total number of models in which the domain was identified, and list of model names. Finally, all identified domains were aligned by RNA sequence to observe the distribution of their secondary structures over the models.
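The first, consensus-driven step of Step 6 can be sketched as follows. The sketch assumes that all secondary structures are already aligned, have equal length, and are pseudoknot-free dot-bracket strings. A domain is taken here to be the fragment closed by an outermost base pair present in at least 50% of the models. The names `parse_pairs` and `consensus_domains` are hypothetical, and the second, domain-boundary-driven step is not covered.

```python
from collections import Counter


def parse_pairs(dot_bracket):
    """Return the set of 1-based (i, j) pairs encoded by round brackets."""
    stack, pairs = [], set()
    for pos, ch in enumerate(dot_bracket, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket at %d" % pos)
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced opening bracket at %d" % stack[-1])
    return pairs


def consensus_domains(structures, min_fraction=0.5):
    """Residue ranges closed by the outermost sufficiently frequent base pairs."""
    n_models = len(structures)
    counts = Counter(pair for s in structures for pair in parse_pairs(s))
    kept = sorted(p for p, c in counts.items() if c / n_models >= min_fraction)
    domains = []
    for i, j in kept:  # sorted by opening position
        if domains and i < domains[-1][1]:
            continue  # nested in (or crossing) the domain already recorded
        domains.append((i, j))
    return domains


# Four hypothetical aligned secondary structures of a 22-nt fragment.
structures = [
    "((((...))))..(((...)))",
    "((((...))))..((.....))",
    "(((.....)))...........",
    ".(((...)))...(((...)))",
]

domains = consensus_domains(structures)
print(domains)
```

Choosing the outermost frequent pair reflects the stated aim of finding the longest possible elements closed by base pairs shared by at least half of the models.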
- Step 7.
- RMSD-based pairwise comparison and clustering of RNA 3D domains. All domains identified in the previous step that were supported by at least three different 3D models were selected for further analysis. For each of them, the corresponding 3D substructures were extracted from all 3D models in which the domain was identified. For each domain, the 3D substructures were compared pairwise by RMSD, and the resulting RMSD matrix was rendered on a color scale. Furthermore, the mean and standard deviation were computed for each RMSD matrix. Finally, for each domain independently, clustering was performed on the RMSD matrix using DBSCAN [74] with the distance parameter set to 10 Å. For each cluster of 3D RNA substructures, we computed the extreme and average RMSD together with the standard deviation, the highest-scoring cluster member (the centroid of the cluster), the average distance to it, and the number of models in which the given domain was present.
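The clustering and per-cluster statistics of Step 7 can be sketched in plain Python. The sketch takes a precomputed RMSD matrix and uses a distance parameter of 10 Å, as in the text. The `min_samples` value, the use of the population standard deviation, the definition of the centroid as the member with the lowest mean RMSD to the other members, and the example matrix are all assumptions made for illustration; the function names are hypothetical.

```python
from statistics import mean, pstdev


def dbscan_precomputed(dist, eps=10.0, min_samples=2):
    """DBSCAN on a square distance matrix; returns labels, with -1 for noise."""
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point, not expanded further
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = [k for k in range(n) if dist[j][k] <= eps]
            if len(reachable) >= min_samples:  # j is a core point
                queue.extend(reachable)
    return labels


def summarize_cluster(dist, members):
    """Min, max, mean, and std of pairwise RMSD, plus the centroid member."""
    if len(members) < 2:
        raise ValueError("a cluster needs at least two members")
    values = [dist[a][b] for k, a in enumerate(members) for b in members[k + 1:]]

    def mean_to_others(a):
        return mean(dist[a][b] for b in members if b != a)

    centroid = min(members, key=mean_to_others)
    return {
        "min": min(values), "max": max(values),
        "mean": mean(values), "std": pstdev(values),
        "centroid": centroid, "mean_to_centroid": mean_to_others(centroid),
    }


# Hypothetical RMSD matrix (in Å) for five substructures of one domain.
rmsd = [
    [0, 2, 4, 30, 32],
    [2, 0, 3, 31, 29],
    [4, 3, 0, 28, 33],
    [30, 31, 28, 0, 25],
    [32, 29, 33, 25, 0],
]

labels = dbscan_precomputed(rmsd, eps=10.0, min_samples=2)
cluster0 = [i for i, lab in enumerate(labels) if lab == 0]
summary = summarize_cluster(rmsd, cluster0)
print(labels, summary)
```

Working directly on the RMSD matrix avoids the need for a common coordinate frame across all substructures, since only pairwise distances enter the clustering.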
4.4. Computational Platform and Parameters Used to Test the Pipeline
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Qu, Z.; Adelson, D.L. Evolutionary conservation and functional roles of ncRNA. Front. Genet. 2012, 3, 205.
- Bhan, A.; Mandal, S.S. Long noncoding RNAs: Emerging stars in gene regulation, epigenetics and human disease. ChemMedChem 2014, 9, 1932–1956.
- Jobe, A.; Liu, Z.; Gutierrez-Vargas, C.; Frank, J. New insights into ribosome structure and function. Cold Spring Harb. Perspect. Biol. 2018, 11, a032615.
- Hajdin, C.E.; Ding, F.; Dokholyan, N.V.; Weeks, K.M. On the significance of an RNA tertiary structure prediction. RNA 2010, 16, 1340–1349.
- Bida, J.P.; Maher, L.J. Improved prediction of RNA tertiary structure with insights into native state dynamics. RNA 2012, 18, 385–393.
- Gumna, J.; Zok, T.; Figurski, K.; Pachulska-Wieczorek, K.; Szachniuk, M. RNAthor – Fast, accurate normalization, visualization and statistical analysis of RNA probing data resolved by capillary electrophoresis. PLoS ONE 2020, 15, e0239287.
- Luwanski, K.; Hlushchenko, V.; Popenda, M.; Zok, T.; Sarzynska, J.; Martsich, D.; Szachniuk, M.; Antczak, M. RNAspider: A webserver to analyze entanglements in RNA 3D structures. Nucleic Acids Res. 2022, 50, W663–W669.
- Popenda, M.; Szachniuk, M.; Antczak, M.; Purzycka, K.; Lukasiak, P.; Bartol, N.; Blazewicz, J.; Adamiak, R. Automated 3D structure composition for large RNAs. Nucleic Acids Res. 2012, 40, e112.
- Boniecki, M.J.; Lach, G.; Dawson, W.K.; Tomala, K.; Lukasz, P.; Soltysinski, T.; Rother, K.M.; Bujnicki, J.M. SimRNA: A coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res. 2016, 44, e63.
- Dawson, W.K.; Bujnicki, J.M. Computational modeling of RNA 3D structures and interactions. Curr. Opin. Struct. Biol. 2016, 37, 22–28.
- Zhao, C.; Xu, X.; Chen, S.J. Predicting RNA Structure with Vfold. Methods Mol. Biol. 2017, 1654, 3–15.
- Ponce-Salvatierra, A.; Astha, M.K.; Nithin, C.; Ghosh, P.; Mukherjee, S.; Bujnicki, J.M. Computational modeling of RNA 3D structure based on experimental data. Biosci. Rep. 2019, 39, BSR20180430.
- Li, B.; Cao, Y.; Westhof, E.; Miao, Z. Advances in RNA 3D Structure Modeling Using Experimental Data. Front. Genet. 2020, 11, 574485.
- Yu, H.; Qi, Y.; Ding, Y. Deep Learning in RNA Structure Studies. Front. Mol. Biosci. 2022, 9, 869601.
- Miao, Z.; Adamiak, R.W.; Antczak, M.; Boniecki, M.J.; Bujnicki, J.; Chen, S.J.; Cheng, C.Y.; Cheng, Y.; Chou, F.C.; Das, R.; et al. RNA-Puzzles Round IV: 3D structure predictions of four ribozymes and two aptamers. RNA 2020, 26, 982–995.
- Gardner, P.P.; Giegerich, R. A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinform. 2004, 5, 140.
- Lukasiak, P.; Antczak, M.; Ratajczak, T.; Bujnicki, J.M.; Szachniuk, M.; Popenda, M.; Adamiak, R.W.; Blazewicz, J. RNAlyzer – Novel approach for quality analysis of RNA structural models. Nucleic Acids Res. 2013, 41, 5978–5990.
- Yang, Y.; Liu, Z. A Comprehensive Review of Predicting Method of RNA Tertiary Structure. Comput. Biol. Bioinform. 2021, 9, 15–20.
- Wiedemann, J.; Kaczor, J.; Milostan, M.; Zok, T.; Blazewicz, J.; Szachniuk, M.; Antczak, M. RNAloops: A database of RNA multiloops. Bioinformatics 2022, 38.
- Miao, Z.; Westhof, E. RNA Structure: Advances and Assessment of 3D Structure Prediction. Annu. Rev. Biophys. 2017, 46, 483–503.
- Nguyen, M.N.; Verma, C. Rclick: A web server for comparison of RNA 3D structures. Bioinformatics 2014, 31, 966–968.
- Wiedemann, J.; Zok, T.; Milostan, M.; Szachniuk, M. LCS-TA to identify similar fragments in RNA 3D structures. BMC Bioinform. 2017, 18, 456.
- Parikesit, A.A.; Valeska, M.D. Comparison of similar RNA 3D structures and substructures search tools. Malays. J. Fund. Appl. Sci. 2020, 16, 408–412.
- Rangan, R.; Watkins, A.M.; Chacon, J.; Kretsch, R.; Kladwang, W.; Zheludev, I.N.; Townley, J.; Rynge, M.; Thain, G.; Das, R. De novo 3D models of SARS-CoV-2 RNA elements from consensus experimental secondary structures. Nucleic Acids Res. 2021, 49, 3092–3108.
- Manfredonia, I.; Nithin, C.; Ponce-Salvatierra, A.; Ghosh, P.; Wirecki, T.K.; Marinus, T.; Ogando, N.S.; Snider, E.J.; van Hemert, M.J.; Bujnicki, J.M.; et al. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements. Nucleic Acids Res. 2020, 48, 12436–12452.
- Zhao, J.; Qiu, J.; Aryal, S.; Hackett, J.L.; Wang, J. The RNA Architecture of the SARS-CoV-2 3’-Untranslated Region. Viruses 2020, 12, 1473.
- Cao, C.; Cai, Z.; Xiao, X.; Rao, J.; Chen, J.; Hu, N.; Yang, M.; Xing, X.; Wang, Y.; Li, M.; et al. The architecture of the SARS-CoV-2 RNA genome inside virion. Nat. Commun. 2021, 12, 3917.
- Huston, N.C.; Wan, H.; Strine, M.S.; de Cesaris Araujo Tavares, R.; Wilen, C.B.; Pyle, A.M. Comprehensive in vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms. Mol. Cell 2021, 81, 584–598.
- Sun, L.; Li, P.; Ju, X.; Rao, J.; Huang, W.; Zhang, S.; Xiong, T.; Xu, K.; Zhou, X.; Ren, L.; et al. In vivo structural characterization of the whole SARS-CoV-2 RNA genome identifies host cell target proteins vulnerable to re-purposed drugs. Cell 2021, 184, 1865–1883.
- Wacker, A.; Weigand, J.E.; Akabayov, S.R.; Altincekic, N.; Bains, J.K.; Banijamali, E.; Binas, O.; Castillo-Martinez, J.; Cetiner, E.; Ceylan, B.; et al. Secondary structure determination of conserved SARS-CoV-2 RNA elements by NMR spectroscopy. Nucleic Acids Res. 2021, 48, 12415–12435.
- Lan, T.C.T.; Allan, M.F.; Malsick, L.E.; Woo, J.Z.; Zhu, C.; Zhang, F.; Khandwala, S.; Nyeo, S.S.Y.; Sun, Y.; Guo, J.U.; et al. Secondary structural ensembles of the SARS-CoV-2 RNA genome in infected cells. Nat. Commun. 2022, 13, 1128.
- Rangan, R.; Zheludev, I.N.; Hagey, R.J.; Pham, E.A.; Wayment-Steele, H.K.; Glenn, J.S.; Das, R. RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: A first look. RNA 2020, 26, 937–959.
- Andrews, R.J.; Peterson, J.M.; Haniff, H.S.; Chen, J.; Williams, C.; Grefe, M.; Disney, M.D.; Moss, W.N. A map of the SARS-CoV-2 RNA structurome. NAR Genom. Bioinform. 2021, 3, lqab043.
- Liu, P.; Li, L.; Millership, J.J.; Kang, H.; Leibowitz, J.L.; Giedroc, D.P. A U-turn motif-containing stem-loop in the coronavirus 5′ untranslated region plays a functional role in replication. RNA 2007, 13, 763–780.
- Chen, S.C.; Olsthoorn, R.C. Group-specific structural features of the 5′-proximal sequences of coronavirus genomic RNAs. Virology 2010, 401, 29–41.
- Tidu, A.; Janvier, A.; Schaeffer, L.; Sosnowski, P.; Kuhn, L.; Hammann, P.; Westhof, E.; Eriani, G.; Martin, F. The viral protein NSP1 acts as a ribosome gatekeeper for shutting down host translation and fostering SARS-CoV-2 translation. RNA 2021, 27, 253–264.
- Vora, S.M.; Fontana, P.; Mao, T.; Leger, V.; Zhang, Y.; Fu, T.M.; Lieberman, J.; Gehrke, L.; Shi, M.; Wang, L.; et al. Targeting stem-loop 1 of the SARS-CoV-2 5’ UTR to suppress viral translation and Nsp1 evasion. Proc. Natl. Acad. Sci. USA 2022, 119, e2117198119.
- Ziv, O.; Price, J.; Shalamova, L.; Kamenova, T.; Goodfellow, I.; Weber, F.; Miska, E.A. The short- and long-range RNA-RNA Interactome of SARS-CoV-2. Mol. Cell 2020, 80, 1067–1077.
- Wu, H.Y.; Guan, B.J.; Su, Y.P.; Fan, Y.H.; Brian, D.A. Reselection of a genomic upstream open reading frame in mouse hepatitis coronavirus 5′-untranslated-region mutants. J. Virol. 2014, 88, 846–858.
- Masters, P.S. Coronavirus genomic RNA packaging. Virology 2019, 537, 198–207.
- Iserman, C.; Roden, C.; Boerneke, M.; Sealfon, R.; McLaughlin, G.; Jungreis, I.; Park, C.; Boppana, A.; Fritch, E.; Hou, Y.J.; et al. Genomic RNA Elements Drive Phase Separation of the SARS-CoV-2 Nucleocapsid. Mol. Cell 2020, 80, 1078–1091.
- Hsue, B.; Masters, P.S. A bulged stem-loop structure in the 3′ untranslated region of the genome of the coronavirus mouse hepatitis virus is essential for replication. J. Virol. 1997, 71, 7567–7578.
- Hsue, B.; Hartshorne, T.; Masters, P. Characterization of an essential RNA secondary structure in the 3′ untranslated region of the murine coronavirus genome. J. Virol. 2000, 74, 6911–6921.
- Goebel, S.J.; Hsue, B.; Dombrowski, T.F.; Masters, P.S. Characterization of the RNA components of a putative molecular switch in the 3′ untranslated region of the murine coronavirus genome. J. Virol. 2004, 78, 669–682.
- Imperatore, J.A.; Cunningham, C.L.; Pellegrene, K.A.; Brinson, R.G.; Marino, J.P.; Evanseck, J.D.; Mihailescu, M.R. Highly conserved s2m element of SARS-CoV-2 dimerizes via a kissing complex and interacts with host miRNA-1307-3p. Nucleic Acids Res. 2022, 50, 1017–1032.
- Goebel, S.J.; Miller, T.B.; Bennett, C.J.; Bernard, K.A.; Masters, P.S. A hypervariable region within the 3′ cis-acting element of the murine coronavirus genome is nonessential for RNA synthesis but affects pathogenesis. J. Virol. 2007, 81, 1274–1287.
- Bottaro, S.; Bussi, G.; Lindorff-Larsen, K. Conformational Ensembles of Noncoding Elements in the SARS-CoV-2 Genome from Molecular Dynamics Simulations. J. Am. Chem. Soc. 2021, 143, 8333–8343.
- Omar, S.I.; Zhao, M.; Sekar, R.V.; Moghadam, S.A.; Tuszynski, J.A.; Woodside, M.T. Modeling the structure of the frameshift-stimulatory pseudoknot in SARS-CoV-2 reveals multiple possible conformers. PLoS Comp. Biol. 2021, 17, e1008603.
- Miao, Z.; Tidu, A.; Eriani, G.; Martin, F. Secondary structure of the SARS-CoV-2 5′-UTR. RNA Biol. 2021, 18, 447–456.
- Antczak, M.; Popenda, M.; Zok, T.; Zurkowski, M.; Adamiak, R.W.; Szachniuk, M. New algorithms to represent complex pseudoknotted RNA structures in dot-bracket notation. Bioinformatics 2018, 34, 1304–1312.
- Dufour, D.; Mateos-Gomez, P.A.; Enjuanes, L.; Gallego, J.; Sola, I. Structure and functional relevance of a transcription-regulating sequence involved in coronavirus discontinuous RNA synthesis. J. Virol. 2011, 85, 4963–4973.
- Laing, C.; Schlick, T. Analysis of four-way junctions in RNA structures. J. Mol. Biol. 2009, 390, 547–559.
- Lee, C.W.; Li, L.; Giedroc, D.P. The solution structure of coronaviral stem-loop 2 (SL2) reveals a canonical CUYG tetraloop fold. FEBS Lett. 2011, 585, 1049–1053.
- Williams, G.D.; Chang, R.Y.; Brian, D.A. A phylogenetically conserved hairpin-type 3′ untranslated region pseudoknot functions in coronavirus RNA replication. J. Virol. 1999, 73, 8349–8355.
- Madhugiri, R.; Fricke, M.; Marz, M.; Ziebuhr, J. Coronavirus cis-Acting RNA Elements. Adv. Virus Res. 2016, 96, 127–163.
- Robertson, M.P.; Igel, H.; Baertsch, R.; Haussler, D.; Ares, M.J.; Scott, W.G. The structure of a rigorously conserved RNA element within the SARS virus genome. PLoS Biol. 2005, 3, e5.
- Li, L.; Kang, H.; Liu, P.; Makkinje, N.; Williamson, S.T.; Leibowitz, J.L.; Giedroc, D.P. Structural lability in stem-loop 1 drives a 5′ UTR-3′ UTR interaction in coronavirus replication. J. Mol. Biol. 2008, 377, 790–803.
- Ryder, S.P.; Morgan, B.R.; Coskun, P.; Antkowiak, K.; Massi, F. Analysis of Emerging Variants in Structured Regions of the SARS-CoV-2 Genome. Evol. Bioinform. 2021, 17, 11769343211014167.
- Zafferani, M.; Haddad, C.; Luo, L.; Davila-Calderon, J.; Chiu, L.Y.; Mugisha, C.S.; Monaghan, A.G.; Kennedy, A.A.; Yesselman, J.D.; Gifford, R.J.; et al. Amilorides inhibit SARS-CoV-2 replication in vitro by targeting RNA structures. Sci. Adv. 2021, 7, eabl6096.
- Sakuraba, S.; Xie, Q.; Kasahara, K.; Iwakiri, J.; Kono, H. Extended ensemble simulations of a SARS-CoV-2 nsp1-5′-UTR complex. PLoS Comput. Biol. 2022, 18, e1009804.
- Aldhumani, A.H.; Hossain, M.I.; Fairchild, E.A.; Boesger, H.; Marino, E.C.; Myers, M.; Hines, J.V. RNA sequence and ligand binding alter conformational profile of SARS-CoV-2 stem loop II motif. Biochem. Biophys. Res. Commun. 2021, 545, 75–80.
- Magnus, M.; Antczak, M.; Zok, T.; Wiedemann, J.; Lukasiak, P.; Cao, Y.; Bujnicki, J.M.; Westhof, E.; Szachniuk, M.; Miao, Z. RNA-Puzzles toolkit: A computational resource of RNA 3D structure benchmark datasets, structure manipulation, and evaluation tools. Nucleic Acids Res. 2020, 48, 576–588.
- Magnus, M. rna-tools.online: A Swiss army knife for RNA 3D structure modeling workflow. Nucleic Acids Res. 2022, 50, W657–W662.
- Gelbin, A.; Schneider, B.; Clowney, L.; Hsieh, S.H.; Olson, W.K.; Berman, H. Geometric parameters in nucleic acids: Sugar and phosphate constituents. J. Am. Chem. Soc. 1996, 118, 519–529.
- Carrascoza, F.; Antczak, M.; Miao, Z.; Westhof, E.; Szachniuk, M. Evaluation of the stereochemical quality of predicted RNA 3D models in the RNA-Puzzles submissions. RNA 2022, 28, 250–262.
- Popenda, M.; Zok, T.; Sarzynska, J.; Korpeta, A.; Adamiak, R.W.; Antczak, M.; Szachniuk, M. Entanglements of structure elements revealed in RNA 3D models. Nucleic Acids Res. 2021, 49, 9625–9632.
- Kabsch, W. A solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A Cryst. Phys. Diffr. Theor. Gen. Crystallogr. 1976, 32, 922–923.
- Barton, G.J. OC: A Cluster Analysis Program; University of Dundee: Dundee, UK, 2002.
- Antczak, M.; Zok, T.; Popenda, M.; Lukasiak, P.; Adamiak, R.W.; Blazewicz, J.; Szachniuk, M. RNApdbee – A webserver to derive secondary structures from PDB files of knotted and unknotted RNAs. Nucleic Acids Res. 2014, 42, W368–W372.
- Rybarczyk, A.; Szostak, N.; Antczak, M.; Zok, T.; Popenda, M.; Adamiak, R.W.; Blazewicz, J.; Szachniuk, M. New in silico approach to assessing RNA secondary structures with non-canonical base pairs. BMC Bioinform. 2015, 16, 276.
- Crooks, G.E.; Hon, G.; Chandonia, J.M.; Brenner, S.E. WebLogo: A sequence logo generator. Genome Res. 2004, 14, 1188–1190.
- Zok, T.; Zablocki, M.; Antczak, M.; Szachniuk, M. RNAtive ranks 3D RNA models and infers the native. 2021; in press.
- Lorenz, R.; Bernhart, S.H.; Höner Zu Siederdissen, C.; Tafer, H.; Flamm, C.; Stadler, P.F.; Hofacker, I.L. ViennaRNA Package 2.0. Algorithms Mol. Biol. 2011, 6, 26.
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; pp. 226–231.
- Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. A Math. Phys. Eng. Sci. 2016, 374, 20150202.
Modeling Group | 5′-UTR Models (293–300 nts) | 5′-UTR Models (450 nts) | 5′-UTR Models (All) | 3′-UTR Models (Pseudoknotted *) | 3′-UTR Models (Non-Pseudoknotted) | 3′-UTR Models (All) | All Models |
---|---|---|---|---|---|---|---|
Bujnicki | 5 | - | 5 | - | 5 | 5 | 10 |
Chen | 10 | - | 10 | 5 | 5 | 10 | 20 |
Das | - | 10 | 10 | 10 | - | 10 | 20 |
Ding | - | - | - | - | 10 | 10 | 10 |
Szachniuk | 5 | 4 | 9 | - | 5 | 5 | 14 |
Total | 20 | 14 | 34 | 15 | 25 | 40 | 74 |
Genomic Region | Range (nts) | Length (nts) | # Models | Min–Max RMSD (Å) | Mean RMSD (Å) (Std Dev) | Centroid | Mean RMSD (Å) to Centroid (Std Dev) |
---|---|---|---|---|---|---|---|
5′-UTR | 1–293 | 293 | 34 | 16.55–64.92 | 41.05 (7.09) | RNAComposer-5UTRD_4 | 36.55 (10.26) |
3′-UTR | 10–337 | 328 | 40 | 10.58–84.50 | 46.51 (9.99) | 3UTR-Das-04_1 | 41.85 (10.00) |
3′-UTR npk * | 1–337 | 337 | 25 | 10.58–84.50 | 44.81 (12.18) | RNAComposer-3UTRSA1_1 | 41.45 (16.18) |
3′-UTR pk * | 10–337 | 328 | 15 | 23.73–59.86 | 39.24 (7.41) | 3UTR-Das-07_1 | 35.25 (11.42) |
Domain | Range (nts) | Length (nts) | # (%) Models with a Given Domain | Mean RMSD (Å) (Std Dev) | Min–Max RMSD (Å) | Centroid | Mean RMSD (Å) to the Centroid (Std Dev) |
---|---|---|---|---|---|---|---|
SL1 | 7–33 | 27 | 34 (100%) | 4.10 (1.31) | 0.20–9.05 | 5UTR-Bujnicki-01 | 3.32 (1.09) |
SL2 | 45–59 | 15 | 34 (100%) | 3.07 (0.87) | 0.33–4.68 | 5UTR-Chen-6_8 | 2.51 (1.14) |
SL3 | 61–75 | 15 | 32 (94%) | 4.26 (1.25) | 0.08–7.69 | 5UTR-Das-07_1 | 3.43 (1.38) |
SL2+SL3 | 45–75 | 31 | 32 (94%) | 8.74 (2.50) | 1.07–14.21 | RNAComposer-5UTRD_4 | 7.54 (2.88) |
SL4 | 84–127 | 44 | 34 (100%) | 5.70 (2.00) | 1.65–12.47 | 5UTR-Bujnicki-04 | 4.63 (1.56) |
SL4a | 131–145 | 15 | 27 (79%) | 3.01 (0.93) | 0.29–6.21 | 5UTR-Chen-6_5 | 2.56 (0.76) |
SL5a | 188–218 | 31 | 34 (100%) | 4.42 (1.40) | 0.35–9.24 | 5UTR-Bujnicki-03 | 3.66 (1.03) |
SL5b | 228–252 | 25 | 34 (100%) | 4.08 (1.05) | 0.88–7.01 | 5UTR-Das-03_1 | 3.56 (0.95) |
SL5 stem | 151–182, 263–293 | 63 | 34 (100%) | 7.77 (2.43) | 1.34–15.90 | RNAComposer-5UTRE5 | 6.17 (2.47) |
4WJ (0,0,0,0) (4-way junction) | 180–185, 225–230, 250–255, 260–265 | 24 | 24 (71%) | 10.71 (4.67) | 0.58–16.56 | RNAComposer-5UTRE3 | 8.90 (6.27) |
Domain | Range (nts) | Length (nts) | # (%) Models with a Given Domain | Mean RMSD (Å) (Std Dev) | Min–Max RMSD (Å) | Centroid | Mean RMSD (Å) to the Centroid (Std Dev) |
---|---|---|---|---|---|---|---|
BSL | 26–72 | 47 | 39 (98%) | 6.17 (2.04) | 0.04–12.93 | 3UTR-Das-07_1 | 4.83 (2.26) |
BSL ext | 15–80 | 66 | 24 (60%) | 8.49 (2.54) | 2.39–16.54 | 3UTR-Ding_6 | 7.11 (3.62) |
P2 | 96–124 | 29 | 38 (95%) | 6.65 (1.90) | 0.04–12.16 | RNAComposer-3UTRSA1_5 | 5.45 (1.79) |
P2 npk | 96–124 | 29 | 23 (92%) | 5.53 (1.48) | 0.56–10.48 | RNAComposer-3UTRSA1_5 | 4.64 (1.65) |
P2 pk | 96–124 | 29 | 15 (100%) | 6.36 (2.20) | 0.04–12.16 | 3UTR-Das-06_1 | 5.23 (2.31) |
HVR hairpin | 172–186 | 15 | 35 (88%) | 2.78 (1.03) | 0.03–5.09 | 3UTR-Chen-2_3 | 2.20 (1.15) |
S2M | 195–235 | 41 | 24 (60%) | 7.82 (2.53) | 1.23–13.78 | 3UTR-Das-01_1 | 6.42 (2.49) |
HVR stem | 128–170, 268–317 | 95 | 33 (83%) | 14.11 (5.16) | 0.63–30.55 | 3UTR-Chen-1_5 | 11.23 (6.02) |
Cluster 1 | 128–170, 268–317 | 95 | 15 (38%) | 8.77 (3.56) | 0.63–14.63 | 3UTR-Chen-1_4 | 6.95 (4.70) |
Cluster 2 | 128–170, 268–317 | 95 | 5 (13%) | 8.77 (1.82) | 6.34–12.11 | 3UTR-Das-03_1 | 7.34 (3.81) |
Cluster 3 | 128–170, 268–317 | 95 | 5 (13%) | 6.12 (2.91) | 1.65–11.39 | RNAComposer-3UTRSA1_3 | 5.03 (2.83) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gumna, J.; Antczak, M.; Adamiak, R.W.; Bujnicki, J.M.; Chen, S.-J.; Ding, F.; Ghosh, P.; Li, J.; Mukherjee, S.; Nithin, C.; et al. Computational Pipeline for Reference-Free Comparative Analysis of RNA 3D Structures Applied to SARS-CoV-2 UTR Models. Int. J. Mol. Sci. 2022, 23, 9630. https://doi.org/10.3390/ijms23179630