Debra Knisley, Jeff Knisley, Chelsea Ross, Alissa Rockney.
Abstract
The prediction of secondary RNA folds from primary sequences continues to be an important area of research given the significance of RNA molecules in biological processes such as gene regulation. To facilitate this effort, graph models of secondary structure have been developed to quantify and thereby characterize the topological properties of the secondary folds. In this work we utilize a multigraph representation of a secondary RNA structure to examine the ability of the existing graph-theoretic descriptors to classify all possible topologies as either RNA-like or not RNA-like. We use more than one hundred descriptors and several different machine learning approaches, including nearest neighbor algorithms, one-class classifiers, and several clustering techniques. We predict that many more topologies will be identified as those representing RNA secondary structures than currently predicted in the RAG (RNA-As-Graphs) database. The results also suggest which descriptors and which algorithms are more informative in classifying and exploring secondary RNA structures.
Year: 2012 PMID: 25969746 PMCID: PMC4393061 DOI: 10.5402/2012/157135
Source DB: PubMed Journal: ISRN Bioinform ISSN: 2090-7338
Figure 1. Topological invariants for RNA multigraphs of order 4.
Figure 2. Variations on the Balaban index.
Figure 3. Line graph invariants.
Figure 4. Clustering of the 50 graphs most distant from the 18 verified as RNA-like (in red).
Figure 5. Two graphs from N .
Figure 6. The 118 multigraphs of order 5.
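The abstract describes scoring multigraphs with graph-theoretic descriptors and comparing them with nearest neighbor methods. The following is a minimal sketch of that idea, not the authors' implementation. It computes two classical descriptors (the Wiener index and the textbook Balaban index) and the Euclidean distance from a query graph to its nearest reference graph. All function names are hypothetical, the choice of descriptors is illustrative, and the handling of parallel edges (counted separately in the edge total and in the edge sum) is an assumption that may differ from the variations in Figure 2. Graphs are assumed to be connected and free of self-loops.

```python
from collections import deque
from math import dist, sqrt


def distance_sums(n, edges):
    """For each vertex 0..n-1, sum the shortest-path distances to all others.

    Parallel edges do not change shortest-path distances, so adjacency
    is stored as sets.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    sums = []
    for source in range(n):
        seen = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen[y] = seen[x] + 1
                    queue.append(y)
        if len(seen) != n:
            raise ValueError("graph must be connected")
        sums.append(sum(seen.values()))
    return sums


def wiener_index(n, edges):
    """Sum of distances over all unordered vertex pairs."""
    return sum(distance_sums(n, edges)) // 2


def balaban_index(n, edges):
    """Textbook Balaban index J = m / (mu + 1) * sum over edges of
    1 / sqrt(D_u * D_v), where mu = m - n + 1 is the cyclomatic number.

    Assumption: each parallel edge is counted separately.
    """
    m = len(edges)
    mu = m - n + 1
    d = distance_sums(n, edges)
    return m / (mu + 1) * sum(1 / sqrt(d[u] * d[v]) for u, v in edges)


def descriptor_vector(n, edges):
    """Illustrative descriptor vector: (edge count, Wiener, Balaban)."""
    return (len(edges), wiener_index(n, edges), balaban_index(n, edges))


def nearest_reference_distance(query, references):
    """Euclidean distance from `query` to its closest reference graph.

    `query` and each reference are (n, edges) pairs.
    """
    q = descriptor_vector(*query)
    return min(dist(q, descriptor_vector(*ref)) for ref in references)
```

A toy usage example, with made-up graphs rather than entries from the RAG database: `nearest_reference_distance((3, [(0, 1), (0, 1), (1, 2)]), [(3, [(0, 1), (1, 2)])])` compares a three-vertex multigraph with one doubled edge against a plain three-vertex path. In practice the descriptors would likely need rescaling before distances are compared, since they live on different scales.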