Pawel Piatkowski, Jagoda Jablonska, Adriana Zyla, Dorota Niedzialek, Dorota Matelska, Elzbieta Jankowska, Tomasz Walen, Wayne K Dawson, Janusz M Bujnicki.
Abstract
RNA has been found to play an ever-increasing role in a variety of biological processes. The function of most non-coding RNA molecules depends on their structure. Comparing and classifying macromolecular 3D structures is of crucial importance for structure-based function inference, and it is used in the characterization of functional motifs and in structure prediction by comparative modeling. However, compared to the numerous methods for protein structure superposition, there are few tools dedicated to superimposing RNA 3D structures. Here, we present SupeRNAlign (v1.3.1), a new method for flexible superposition of RNA 3D structures, and SupeRNAlign-Coffee, a workflow that combines SupeRNAlign with T-Coffee for inferring structure-based sequence alignments. The methods have been benchmarked against eight other methods for RNA structural superposition and alignment. The benchmark included 151 structures from 32 RNA families (with a total of 1734 pairwise superpositions). The accuracy of superpositions was assessed by comparing structure-based sequence alignments to the reference alignments from the Rfam database. SupeRNAlign and SupeRNAlign-Coffee achieved significantly higher scores than most of the benchmarked methods: SupeRNAlign generated the most accurate sequence alignments among the structure superposition methods, and SupeRNAlign-Coffee performed best among the sequence alignment methods.
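To make the accuracy assessment concrete, the sketch below scores a pairwise alignment against a reference alignment using a common definition of the sum-of-pairs score: the fraction of residue pairs aligned in the reference that are also aligned in the tested alignment. The exact scoring procedure is not given in this record, so this definition, the function names `aligned_pairs` and `sum_of_pairs`, and the toy sequences are assumptions for illustration only.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue-index pairs that share an alignment column."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def sum_of_pairs(test, reference):
    """Fraction of reference-aligned residue pairs reproduced by the test alignment.

    Both arguments are (row_a, row_b) tuples of gapped strings for the same two
    sequences. This is an assumed, commonly used definition of the score.
    """
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned pairs")
    test_pairs = aligned_pairs(*test)
    return len(ref_pairs & test_pairs) / len(ref_pairs)


# Toy example (made-up sequences, not benchmark data)
reference = ("GGCA-U", "GG-AAU")
test = ("GGCAU", "GGAAU")
score = sum_of_pairs(test, reference)
```

A score of 1.0 means every reference pair is reproduced; the score says nothing about extra pairs that the tested alignment adds beyond the reference.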
Year: 2017 PMID: 28934487 PMCID: PMC5766185 DOI: 10.1093/nar/gkx631
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. SupeRNAlign workflow. Optional steps are indicated with dashed lines.
Figure 2. Graphical illustration of the SupeRNAlign workflow, using as an example a pair of tRNA(Asn) molecules (PDB code: 3KFU, reference structure shown in dark grey; and PDB code: 4WJ4, aligned structure shown in other colors). (A) First round: result of superposition of two RNA structures treated as rigid bodies; the aligned structure is then analyzed by ClaRNet, and the two substructures identified are colored blue and orange. (B) Second round: result of independent superposition of two fragments of the aligned structure identified by ClaRNet onto the corresponding fragments of the reference structure; a fragment identified as ‘well superimposed’ and frozen for further iterations is indicated in cyan, while the remaining fragments continue to be subjected to superposition. (C) Third round: result of superposition of fragments that remained ‘free’ after the previous iteration; an additional region in the CCA stem is found to be ‘well superimposed’ and is colored in cyan, while regions that remain above the threshold of ‘good superposition’ are still shown in blue and orange. (D) The final superposition, in which the single-stranded CCA terminus (in the bottom left corner) is superimposed well and colored in cyan, while the superposition of other ‘free’ fragments (now shown in grey) does not improve according to SupeRNAlign; this superposition is used to generate the final sequence alignment.
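The rounds described in this caption can be sketched roughly as follows. This is not the SupeRNAlign implementation: it assumes a one-to-one atom correspondence between the two structures is already known, uses the standard Kabsch algorithm for each rigid superposition, takes the fragment definitions as input instead of calling ClaRNet, and splits a poorly fitting fragment in half as a hypothetical stand-in for re-analysis between rounds. The `threshold`, `min_size` and `max_rounds` parameters are likewise illustrative.

```python
import numpy as np


def kabsch(mobile, reference):
    """Rotation R and translation t minimizing the RMSD of mobile @ R.T + t to reference."""
    mc, rc = mobile.mean(axis=0), reference.mean(axis=0)
    H = (mobile - mc).T @ (reference - rc)       # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0  # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = rc - R @ mc
    return R, t


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))


def iterative_superposition(mobile, reference, fragments,
                            threshold=1.0, min_size=4, max_rounds=10):
    """Superimpose fragments independently, freezing those that fit well.

    fragments: list of index arrays into the coordinate arrays (assumed input).
    Returns the transformed coordinates and the list of frozen fragments.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # First round: both structures treated as rigid bodies.
    R, t = kabsch(mobile, reference)
    coords = mobile @ R.T + t

    free = [np.asarray(f) for f in fragments]
    frozen = []
    for _ in range(max_rounds):
        still_free = []
        for idx in free:
            # Superimpose this fragment onto its reference counterpart.
            R, t = kabsch(coords[idx], reference[idx])
            coords[idx] = coords[idx] @ R.T + t
            if rmsd(coords[idx], reference[idx]) <= threshold:
                frozen.append(idx)               # 'well superimposed': freeze
            elif len(idx) >= 2 * min_size:
                half = len(idx) // 2             # hypothetical re-fragmentation
                still_free += [idx[:half], idx[half:]]
            # Fragments too small to split are left as they are.
        if not still_free:
            break
        free = still_free
    return coords, frozen
```

Because a rigid fit of an unchanged fragment cannot improve on repetition, some form of re-fragmentation between rounds is what lets later rounds find new well-fitting regions; how the actual program chooses those fragments is not specified here.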
Programs analyzed in this work
| Name | Ref. | URL (http://) | Application type | Language | SSU rRNA⁵ | LSU rRNA⁶ | Output data | Algorithm |
|---|---|---|---|---|---|---|---|---|
| ARTS | | | Standalone & server | Compiled | + | + | Two .pdb files² | Structurally similar tuples of P atoms |
| SARA | | | Standalone & server | Python | – | – | Single .pdb file³ | Unit-vector structural representation |
| LaJolla | | | Standalone | Java | – | – | List of .pdb files⁴ | n-gram query-target matching |
| R3D Align | | | Standalone & server | MATLAB¹ | + | + | Single .pdb file³ | ‘Maximum-clique’ |
| SETTER | | | Standalone & server | Compiled | + | + | Rotation/translation data | Generalized secondary structure units |
| iPARTS | | | Server | N/A | – | – | Single .pdb file³ | Discretized structural alphabet |
| SARA-Coffee | | | Standalone & server | Compiled | – | – | Sequence alignment | SARA + R-Coffee |
| Rclick | | | Server | N/A | + | – | Two .pdb files² | Clique matching, based on CLICK |
| SupeRNAlign | this work | | Standalone & server | Python | + | + | Single .pdb file³ | Iterative superpositions of structural fragments |
| SupeRNAlign-Coffee | this work | | Standalone | Python | + | + | Sequence alignment | SupeRNAlign + R-Coffee |

¹ Can be executed under GNU Octave.
² Each output file contains one structure.
³ The output file contains both superimposed structures.
⁴ Each output file contains one structure; multiple superposition models are produced.
⁵ Program did (‘+’) or did not (‘–’) align SSU rRNA structures in the specified time (12 h).
⁶ Program did (‘+’) or did not (‘–’) align LSU rRNA structures in the specified time (12 h).
Figure 3. A comparison of the accuracy of benchmarked methods. These boxplots show the distribution of scores (A, sum-of-pairs; B, RMSD in Å, shown on a logarithmic scale) obtained by the RNA superposition methods. Boxes mark quartiles (Q1, median, Q3); whiskers stretch from the 1st to the 99th percentile; outliers are shown as dots.
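The box and whisker conventions in this caption differ from the common 1.5 × IQR default, so a short sketch of how such a summary could be computed may help when reading the plot. It uses `statistics.quantiles` from the Python standard library; the `"inclusive"` interpolation method and the name `box_summary` are assumptions, since the plotting software and its percentile method are not stated here.

```python
from statistics import quantiles


def box_summary(scores):
    """Summarize scores as in the caption: quartile box, 1st-99th percentile whiskers.

    Values outside the whiskers are reported as outliers.
    Requires at least two data points.
    """
    # 99 cut points: cuts[k - 1] is the k-th percentile.
    cuts = quantiles(scores, n=100, method="inclusive")
    p1, q1, median, q3, p99 = cuts[0], cuts[24], cuts[49], cuts[74], cuts[98]
    outliers = sorted(s for s in scores if s < p1 or s > p99)
    return {
        "whisker_low": p1,
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": p99,
        "outliers": outliers,
    }


# Toy input (not benchmark data): the integers 0..100
summary = box_summary(list(range(101)))
```

With whiskers at the 1st and 99th percentiles, roughly 2% of a large sample is drawn as outlier dots regardless of the shape of the distribution.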
Figure 4. A comparison of the accuracy of benchmarked methods within RNA families. The plots show scores (A, sum-of-pairs; B, RMSD in Å, on a logarithmic scale) obtained by the benchmarked programs for each RNA family. Each symbol represents the median score for the particular family; different programs are marked with colors and symbols. SupeRNAlign and SupeRNAlign-Coffee are denoted in black. The families where either SupeRNAlign or SupeRNAlign-Coffee performed best are marked with red dots. The families are sorted alphabetically, and this sorting order is consistent with the order in the tables to facilitate comparison of results.