Marcin Magnus, Kalli Kappel, Rhiju Das, Janusz M Bujnicki.
Abstract
BACKGROUND: The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule's sequence. The prediction of tertiary structures of complex RNAs is still a challenging task.
Keywords: RNA; RNA 3D structure prediction; RNA evolution; RNA folding; Rosetta; SimRNA
Year: 2019 PMID: 31640563 PMCID: PMC6806525 DOI: 10.1186/s12859-019-3120-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 RNA families tend to fold into the same 3D shape. Structures of the c-di-AMP riboswitch solved independently by three groups: for two different sequences obtained from Thermoanaerobacter pseudethanolicus (PDB ID: 4QK8) and Thermovirga lienii (PDB ID: 4QK9) [18], for a sequence from Thermoanaerobacter tengcongensis (PDB ID: 4QLM) [19], and for a sequence from Bacillus subtilis (PDB ID: 4W90) (the molecule in blue is a protein used to facilitate crystallization) [20]. There is some variation between the structures in the peripheral parts, but the overall structure of the core is conserved.
Fig. 2 The workflow implemented as EvoClustRNA, illustrated with the structure prediction of the ZMP riboswitch (RNA-Puzzle 13). (1) Sequences of homologs are found for the target sequence, and an RNA alignment is prepared. (2) Structural models are generated for all sequences using Rosetta and/or SimRNA. (3) The conserved regions are extracted and clustered. (4) The final prediction of the method is the model containing the most commonly preserved structural arrangements in the set of homologs.
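Step (3) of the workflow can be illustrated with a minimal sketch. It assumes, purely for illustration, that a "conserved region" is an alignment column with no gap in any sequence; the caption does not give the exact criterion used by EvoClustRNA, and the function names and the toy alignment below are hypothetical.

```python
def conserved_columns(alignment):
    """Return alignment columns in which no sequence has a gap.

    Assumption: gap-free columns stand in for the 'conserved regions';
    the real criterion used by EvoClustRNA may differ.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [c for c in range(length) if all(s[c] not in "-." for s in seqs)]


def residue_indices(aligned_seq, columns):
    """Map alignment columns to 0-based residue indices of the ungapped sequence."""
    mapping = {}
    residue = 0
    for col, ch in enumerate(aligned_seq):
        if ch not in "-.":
            mapping[col] = residue
            residue += 1
    return [mapping[c] for c in columns]


# Hypothetical toy alignment (not taken from the paper).
toy_alignment = {
    "target":   "GGA-CUUCG",
    "homolog1": "GGAACU-CG",
    "homolog2": "GCA-CUACG",
}
core_cols = conserved_columns(toy_alignment)
core_by_seq = {
    name: residue_indices(seq, core_cols) for name, seq in toy_alignment.items()
}
```

Each sequence ends up with the same number of core residues, so models built for different homologs can be compared position by position over the shared core.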
Fig. 3 RNA-Puzzle 13, the ZMP riboswitch. Superposition of the native structure (green) and the EvoClustRNA prediction (blue). The RMSD between the structures is 5.5 Å; the prediction was ranked second in the total ranking of the RNA-Puzzles (according to the RMSD values).
Fig. 4 RNA-Puzzle 14, the L-glutamine riboswitch. The RMSD between the native structure (green) and the EvoClustRNA prediction (blue) is 5.5 Å.
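The RMSD values quoted in these captions follow the standard definition, the square root of the mean squared distance between equivalent atoms. A minimal sketch is given below. It assumes the two structures are already optimally superimposed and that atoms are listed in matching order; the coordinates are made up for illustration and the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in Å (not from any PDB entry).
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
value = rmsd(native, model)
```

Restricting both coordinate lists to the conserved core residues before calling the function would give a core RMSD in the same way.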
Fig. 5 Core RMSD [Å] for the best 100 models for sequences of homologs with SimRNA and Rosetta. Tar stands for “Target” sequence. Adenine riboswitch: a04 (Clostridioides difficile, AAML04000013.1), a99 (Streptococcus pyogenes, AAFV01000199.1), b28 (Oceanobacillus iheyensis, BA000028.3), u51 (Bacillus subtilis, U51115.1); c-di-GMP riboswitch: gap (Clostridium tetani, AE015927.1), gba (Bacillus halodurans, BA000004.3), gbx (Peptoclostridium difficile, ABFD02000011.1), gxx (Deinococcus radiodurans, AE000513.1); TPP riboswitch: tc5 (Xanthomonas campestris, CP000050.1), tae (Geobacter sulfurreducens, AE017180.1), tb2 (Corynebacterium diphtheriae, BX248356.1), tal (Streptococcus agalactiae, AL766847.1); THF riboswitch: tha (Marvinbryantia formatexigens, ACCL02000010.1), hak (Oribacterium sinus, ACKX01000080.1), haq (metagenome sequence, AAQK01002704.1), hcp (Natranaerobius thermophilus, CP001034.1); tRNA: taf (Tetrahymena thermophila, AF396436.1), tm5 (Rana catesbeiana, M57527.1), tab (Drosophila melanogaster, AB009835.1), tm2 (Methanothermus fervidus, M26977.1); RNA-Puzzle 13: zcp (Ralstonia pickettii, CP001644.1), znc (Bradyrhizobium sp. ORS 278, CU234118.1), zc3 (Ralstonia solanacearum, CP025741.1), zza (Caulobacter sp. K31, CP000927.1); RNA-Puzzle 14: a22 (marine metagenome, AACY022736085.1), aa2 (Synechococcus sp. JA-2-3B’a(2–13), AACY020096225.1), aj6 (Cyanophage phage, AJ630128.1), cy2 (marine metagenome, AACY023015051.1); RNA-Puzzle 17 (sequences were obtained from the alignment provided by [30]): s21 (2236876011_199011), hcf (HCF12C_58327), s23 (2210131864), pis (sequence experimentally investigated in [30]).
Fig. 6 Comparison of RMSD [Å], core RMSD [Å], and INF for variants of EvoClustRNA and controls. The boxplots are sorted according to the median. For each RNA family, one point per method is shown: the medoid (the model with the highest number of neighbors) of the biggest (first) cluster.
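The medoid selection described in this caption can be sketched as follows, assuming a precomputed matrix of pairwise (core) RMSD values and a distance cutoff that defines neighbors. The cutoff value, the function name, and the toy matrix are hypothetical; the caption does not specify how EvoClustRNA sets the threshold.

```python
def biggest_cluster(dist, cutoff):
    """Return (medoid, members) of the first, biggest cluster.

    The medoid is the model with the most neighbors within `cutoff`;
    ties are broken by the lowest model index. `dist` is a symmetric
    matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    neighbors = [
        [j for j in range(n) if j != i and dist[i][j] <= cutoff]
        for i in range(n)
    ]
    medoid = max(range(n), key=lambda i: len(neighbors[i]))
    members = sorted([medoid] + neighbors[medoid])
    return medoid, members


# Hypothetical pairwise RMSD matrix in Å for five models.
toy_dist = [
    [0, 2, 3, 9, 9],
    [2, 0, 2, 9, 9],
    [3, 2, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]
medoid, members = biggest_cluster(toy_dist, cutoff=2.5)
```

In this toy case, model 1 has two neighbors, more than any other model, so it is returned as the representative of the biggest cluster.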
Fig. 7 Core RMSD [Å] for the best 100 models for an extended set of sequences of homologs modeled with SimRNA (purine riboswitch, RNA-Puzzle 17, THF riboswitch, cyclic-di-GMP riboswitch). Tar stands for “Target” sequence. The first four sequences are the same as in Fig. 5 and are shown here for comparison with the sequences of additional homologs. The full list of sequences and secondary structures used for modeling can be found in Additional file 4. The horizontal line depicts the RMSD of the best model for the target sequence.
Fig. 8 Clustering visualized with Clanstix/CLANS for RNA-Puzzle 17 and the TPP riboswitch for models generated with SimRNA. RNA-Puzzle 17 (a-c): (a) the native structure, (b) the model with a fold close to the native, detected in a small cluster, (c) the biggest cluster with the model that was selected as the final prediction by EvoClustRNA. TPP riboswitch (d-f): (d) the native structure, (e) the model with a fold close to the native, (f) the biggest cluster with the model that was selected as the final prediction by EvoClustRNA.