
FledFold: A Novel Software for RNA Secondary Structure Prediction.

Qi Zhao1,2, Yuanning Liu1, Yunna Duan1, Tao Dai1, Rui Xu1, Hao Guo2, Daiming Fan2, Yongzhan Nie2, Hao Zhang1.   

Abstract

BACKGROUND: RNA secondary structure is essential for understanding the mechanisms of RNAs.
METHOD: In this paper, fledFold, a novel software package for RNA secondary structure prediction, is introduced. It combines both thermodynamic and kinetic factors of RNA secondary structures and can predict RNA secondary structures from their primary sequences on a local personal computer.
RESULTS: FledFold is implemented in C++ under Windows 7 and runs on Windows 7 or later with at least 2 GB of RAM. FledFold is user-friendly and can output results in multiple formats.
CONCLUSION: FledFold will be a valuable tool for RNA research and can be downloaded freely from http://www.jlucomputer.com/fledfold.php.


Keywords:  Bioinformatics tools; RNA secondary structure prediction; computer; molecule structure; primary sequences; software

Year:  2017        PMID: 29123460      PMCID: PMC5652076          DOI: 10.2174/1570178614666170419122621

Source DB:  PubMed          Journal:  Lett Org Chem        ISSN: 1570-1786            Impact factor:   0.867


INTRODUCTION

Recently, researchers have discovered a large number of non-coding RNAs (ncRNAs) [1, 2], which serve many different roles [3], such as modulating gene expression [4], catalyzing reactions [5], immunity [6] and development [7]. It is well known that the functions of ncRNAs are more closely related to their secondary structures (Fig. ) than to their primary sequences. Therefore, RNA secondary structure has received increasing attention. The concept of RNA secondary structure began with the work of Doty [9]. Generally, an RNA secondary structure can be defined as a set of canonical base pairs, including AU, GC and GU. Since it is often difficult to obtain X-ray diffraction [10] or nuclear magnetic resonance (NMR) spectroscopy data for RNA molecules to inspect their structures [11], accurately predicting RNA structures from their primary sequences is highly desirable. Mfold [12, 13] was the first practical dynamic programming algorithm that could predict the optimal secondary structure from a single RNA sequence. However, the accuracy of mfold remains to be improved, especially for long RNA sequences such as full-length small subunit ribosomal RNA (rRNA) and large subunit rRNA [14]. Comparative analysis [15] is the most accurate method when a large number of homologous sequences are available. However, this method requires both significant user input and a large number of well-aligned homologous sequences.

In this contribution, an alternative software package, fledFold, is described. FledFold combines both thermodynamic and kinetic factors of RNA secondary structure and predicts RNA secondary structures from primary sequences. Our prior work [8] has shown that the accuracy of fledFold is higher than that of traditional methods, especially for RNAs without pseudoknots. Hence, it is helpful to provide a software package that users can run themselves. The details of using fledFold to predict RNA secondary structure are introduced in this paper. We believe that fledFold will be a valuable tool for RNA research. FledFold can be downloaded at http://www.jlucomputer.com/fledfold.php.

METHODS

The technical details of fledFold can be found in our original publication [8]; here, we only highlight its pipeline. FledFold combines both thermodynamics and kinetics, and was designed under the assumption that the RNA folding process from the random coil state to the full structure state is staged. In each folding stage, the final state of an RNA is determined by the optimal combination of the helical regions that are most urgent to form under the current RNA state. FledFold uses the nearest neighbor (NN) model [16] to calculate the free energy of an RNA secondary structure. This model assumes that the free energy of an RNA secondary structure is the sum of the energies of its loops and helical regions. The thermodynamic parameters used in the NN model are the Turner 1999 parameters [16]. FledFold predicts only the most likely secondary structure from a single RNA sequence, which makes it easy for non-expert users. FledFold works in batch mode, and the processing pipeline for each RNA sequence is shown in Fig. (2).
Fig. (2)

The overview of the processing procedure of fledFold.
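
The additivity assumption of the NN model described above can be written compactly. The notation below is ours and is not taken from the fledFold publication: S denotes a secondary structure, H(S) its set of helical regions and L(S) its set of loops.

```latex
% Illustrative notation only (an assumption, not the paper's):
%   S    : an RNA secondary structure
%   H(S) : the helical regions of S
%   L(S) : the loops of S
\Delta G(S) \;=\; \sum_{h \in H(S)} \Delta G_{\mathrm{helix}}(h)
            \;+\; \sum_{l \in L(S)} \Delta G_{\mathrm{loop}}(l)
```

Each term on the right-hand side would be looked up or computed from a thermodynamic parameter set, which for fledFold is Turner 1999 [16].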

RESULTS

The usage of fledFold is very simple, as shown in Fig. (3). There is no need to configure any parameters or upload files to any website. The only input is one FASTA file or a group of FASTA files, which must be placed under the path 'fledFold/sequences'. At present, only one RNA sequence is allowed in each FASTA file, and it must be written with the characters 'A'-'Z' or 'a'-'z'. If a DNA sequence is input, it is converted into the corresponding RNA sequence automatically. In addition, all rare bases are converted into the corresponding bases. FledFold reports errors if illegal characters are detected in the input FASTA files. The executable file 'fledFold.exe' under the path 'fledFold/' can be run either by double-clicking it or from the command line. FledFold can process at most 1000 FASTA files at one time. When processing is completed, output files describing the RNA secondary structure in multiple formats are generated and saved automatically under the path 'fledFold/sequences/'.
Fig. (3)

The procedure of using fledFold.
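
The input rules described above (one sequence per FASTA file, letters only, DNA converted to RNA) can be illustrated with a minimal Python sketch. This is not fledFold's code, which is written in C++; the function names `read_single_fasta` and `to_rna` are hypothetical, and the rare-base conversion is left out because the paper does not list the mapping.

```python
def read_single_fasta(text):
    """Parse FASTA text that must contain exactly one record.

    Returns (header, sequence). Hypothetical helper, not part of fledFold.
    """
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                raise ValueError("only one sequence is allowed per FASTA file")
            header = line[1:].strip()
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def to_rna(sequence):
    """Validate a sequence and convert DNA to RNA (T becomes U).

    Only the letters A-Z and a-z are accepted, as stated in the text.
    Rare-base conversion is NOT implemented: the mapping is not given
    in the paper, so any choice here would be a guess.
    """
    if not sequence or not sequence.isascii() or not sequence.isalpha():
        raise ValueError("illegal character in sequence")
    return sequence.upper().replace("T", "U")
```

For example, calling `read_single_fasta` on a file body such as `">demo\nacgt\nTTGA\n"` and passing the resulting sequence to `to_rna` gives an upper-case RNA string, while a digit or a second `>` header raises `ValueError`.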

The results of fledFold are presented in dot-parenthesis format, Connectivity Table (CT) format and Scalable Vector Graphics (SVG) format. Each output file has the same name as the corresponding input FASTA file, with a different suffix. Dot-parenthesis files and CT files can be used to draw RNA structure figures conveniently. SVG files keep their print quality no matter how much they are enlarged or shrunk, which is convenient for inspecting the details of predicted structures. Additional plug-in software, such as Adobe SVG Viewer or Corel SVG Viewer, is required to open SVG files in a browser.

FledFold is implemented in C++ under Windows 7 and runs on Windows 7 or later with at least 2 GB of RAM. A help document is provided in the fledFold package to guide users; it introduces the input and output of fledFold and gives some usage details. Our prior work [8] suggested that fledFold performs better than other algorithms, especially for RNA sequences without pseudoknots. FledFold takes only several seconds to predict the secondary structures of sequences shorter than 400 nt on our computers (Processor: i5, RAM: 4 GB, OS: Windows 7).
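
To show how the dot-parenthesis and CT formats relate, the following Python sketch converts one into the other. It is an illustration only and not fledFold's writer: the function names are hypothetical, the columns follow the common CT layout (index, base, previous index, next index, pairing partner, index), and the exact header line and column conventions that fledFold emits are an assumption.

```python
def pair_table(dot_bracket):
    """Return a list whose entry i is the 1-based partner of base i+1, or 0."""
    partners = [0] * len(dot_bracket)
    stack = []  # 0-based positions of currently unmatched '('
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % (i + 1))
            j = stack.pop()
            partners[i] = j + 1
            partners[j] = i + 1
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % (stack[-1] + 1))
    return partners


def to_ct(sequence, dot_bracket, title="sequence"):
    """Render a sequence and its dot-parenthesis string as CT-style text."""
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure differ in length")
    n = len(sequence)
    partners = pair_table(dot_bracket)
    lines = ["%d\t%s" % (n, title)]  # header: length and title
    for i, (base, partner) in enumerate(zip(sequence, partners), start=1):
        next_index = i + 1 if i < n else 0  # 0 marks the 3' end
        lines.append("%d\t%s\t%d\t%d\t%d\t%d"
                     % (i, base, i - 1, next_index, partner, i))
    return "\n".join(lines)
```

With the toy hairpin `to_ct("GGAACC", "((..))", "demo")`, base 1 is reported as paired with base 6 and bases 3 and 4 as unpaired (partner 0). A stack-based parser like this one handles nested pairs only, which matches structures without pseudoknots.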

CONCLUSION

FledFold makes it convenient for users to predict RNA secondary structures from their primary sequences. The current version of fledFold is 1.0, and several improvements are underway. At present, fledFold can only run on a single machine, although many of its processes could be executed in parallel. The speed of fledFold would therefore improve significantly if distributed computation were used, and providing distributed execution is part of our future work. Only a Windows version of fledFold is available at present; Linux and Mac versions are under development. Once these versions are completed, they will be uploaded to the same address as the Windows version. In addition, fledFold cannot yet use prior knowledge, for example, data from enzymatic cleavage [17], chemical mapping [18] or SHAPE [19]. Such experimental data could significantly improve the accuracy of prediction; hence, we plan to make use of prior knowledge to improve prediction results in a later version of fledFold.
  17 in total

Review 1.  NMR spectroscopy of RNA.

Authors:  Boris Fürtig; Christian Richter; Jens Wöhnert; Harald Schwalbe
Journal:  Chembiochem       Date:  2003-10-06       Impact factor: 3.164

Review 2.  Mechanisms of gene silencing by double-stranded RNA.

Authors:  Gunter Meister; Thomas Tuschl
Journal:  Nature       Date:  2004-09-16       Impact factor: 49.962

3.  ProbKnot: fast prediction of RNA secondary structure including pseudoknots.

Authors:  Stanislav Bellaousov; David H Mathews
Journal:  RNA       Date:  2010-08-10       Impact factor: 4.942

4.  A New Method to Predict RNA Secondary Structure Based on RNA Folding Simulation.

Authors:  Yuanning Liu; Qi Zhao; Hao Zhang; Rui Xu; Yang Li; Liyan Wei
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2015-11-03       Impact factor: 3.710

5.  The functional genomics of noncoding RNA.

Authors:  John S Mattick
Journal:  Science       Date:  2005-09-02       Impact factor: 47.728

Review 6.  How ribosomes make peptide bonds.

Authors:  Marina V Rodnina; Malte Beringer; Wolfgang Wintermeyer
Journal:  Trends Biochem Sci       Date:  2006-12-08       Impact factor: 13.807

Review 7.  Let me count the ways: mechanisms of gene regulation by miRNAs and siRNAs.

Authors:  Ligang Wu; Joel G Belasco
Journal:  Mol Cell       Date:  2008-01-18       Impact factor: 17.970

8.  Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information.

Authors:  M Zuker; P Stiegler
Journal:  Nucleic Acids Res       Date:  1981-01-10       Impact factor: 16.971

9.  RNAProfile: an algorithm for finding conserved secondary structure motifs in unaligned RNA sequences.

Authors:  Giulio Pavesi; Giancarlo Mauri; Marco Stefani; Graziano Pesole
Journal:  Nucleic Acids Res       Date:  2004-06-15       Impact factor: 16.971

10.  Probing the phenomics of noncoding RNA.

Authors:  John S Mattick
Journal:  Elife       Date:  2013-12-31       Impact factor: 8.140

