Software.ncrna.org: web servers for analyses of RNA sequences.

Kiyoshi Asai, Hisanori Kiryu, Michiaki Hamada, Yasuo Tabei, Kengo Sato, Hiroshi Matsui, Yasubumi Sakakibara, Goro Terai, Toutai Mituyama

Abstract

We present web servers for analysis of non-coding RNA sequences on the basis of their secondary structures. Software tools for structural multiple sequence alignment, structural pairwise sequence alignment and structural motif finding are available from the integrated web server and from the individual stand-alone web servers. The servers are located at http://software.ncrna.org, along with information on evaluation and downloads. The website is freely available to all users and there is no login requirement.

Year:  2008        PMID: 18440970      PMCID: PMC2447773          DOI: 10.1093/nar/gkn222

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Comparisons, alignments and motif identification are essential procedures for extracting valuable information from biological sequences. Many effective software tools for these purposes are available for use with amino acid and DNA sequences, but their efficiency for RNA sequences is limited because they do not accommodate analysis of possible secondary structures. Practical analyses of multiple RNA sequences in light of their secondary structures have been difficult because of their extremely high computational costs, but several algorithms have been proposed and there are a few websites of software tools that support structure-based analyses of RNA sequences, e.g. Vienna RNA Package (http://www.tbi.univie.ac.at/~ivo/RNA/), Sfold (http://sfold.wadsworth.org) and BiBiServ (http://bibiserv.techfak.uni-bielefeld.de/). Recent progress in RNA sequence analysis has created a demand for rapid and accurate structure-based analyses of multiple RNA sequences. To this end, we have developed several software tools for comparison (1,2), alignment (3–8) and motif identification (9) of multiple RNA sequences; searches for conserved miRNAs (10); prediction of common secondary structures from multiple sequence alignments (11) and calculation of base-pairing probabilities for long sequences (12). Using these software tools, we have developed an integrated web server and stand-alone web servers (software.ncrna.org) that support multiple alignment, pairwise alignment and extraction of structural motifs of RNA sequences.

METHODS

The integrated web server and the stand-alone web servers we developed offer three types of RNA sequence analyses based on common potential secondary structures: pairwise alignment, multiple alignment and structural motif extraction. SCARNA, PHMMTS (pair hidden Markov models on tree structures), PSTAG (pair stochastic tree adjoining grammar), Murlet and MXSCARNA can be used on their stand-alone web servers as well as on our integrated server. In addition, the source code for PHMMTS, PSTAG, Murlet and MXSCARNA is available for download. Brief introductions to these tools follow; detailed evaluations are described in refs (3–9) and summarized on the website (http://software.ncrna.org).

Pairwise alignment

SCARNA (3) is a rapid structural pairwise alignment tool for RNA sequences of unknown secondary structure. This program separately aligns the 5′ and 3′ regions of stem candidates, which are extracted from each RNA sequence in light of base-pairing probabilities (12,13), by use of an engineered DP algorithm that incorporates rough consideration of consistency. We compared SCARNA with several other alignment tools by using Gardner's benchmark dataset (14) and a dataset comprising 5S ribosomal RNA, 5.8S ribosomal RNA and Hammerhead ribozyme from the Rfam database (15). The alignment accuracies of SCARNA for sequences with low similarities were not as high as those of programs that evaluate secondary structures more strictly, e.g. Foldalign (16), Dynalign (17) and PMcomp (18). However, the computational speed of SCARNA was approximately one order of magnitude faster (i.e. <1 min for 1000 bases) and allowed alignment of sequences longer than 1000 bases. PHMMTS (4,5) and PSTAG (6) are tools for aligning RNA sequences of unknown secondary structure to RNA sequences with known secondary structure. PHMMTS evaluates only pseudoknot-free structures, whereas PSTAG can accept pseudoknotted structures. When compared with ClustalW (19) by using tRNA and Hammerhead ribozyme datasets, PHMMTS was more accurate in regard to correct assignment of secondary structures. In a comparison with PHMMTS and ClustalW by using RNA sequences of HDV_ribozyme, an RNA family in PseudoBase (20) that includes pseudoknotted structures, PSTAG was more accurate in correct assignment of secondary structures.
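The distinction between pseudoknot-free and pseudoknotted structures drawn above can be made concrete. A structure is usually called pseudoknotted when two of its base pairs (i, j) and (k, l) cross, that is, i < k < j < l. The following is a minimal sketch of that test, assuming base pairs are given as 0-based index tuples; the function name `has_pseudoknot` is hypothetical and the code is not taken from PHMMTS or PSTAG.

```python
from itertools import combinations


def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l)."""
    # Normalise each pair so that the 5' index comes first.
    norm = sorted(tuple(sorted(p)) for p in pairs)
    for (i, j), (k, l) in combinations(norm, 2):
        # The list is sorted, so i <= k here; the pairs cross when
        # k opens inside (i, j) and l closes outside it.
        if i < k < j < l:
            return True
    return False


# Illustrative inputs (not from the paper's datasets).
nested = [(0, 9), (1, 8), (2, 7)]   # a simple hairpin stem
crossing = [(0, 6), (3, 9)]         # two pairs that cross
```

The pairwise check is quadratic in the number of base pairs, which is adequate for sequences within the length limits discussed here.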

Multiple alignment

Murlet (7) and MXSCARNA (8) are structural multiple alignment tools for RNA sequences. Murlet is based on pair SCFG (stochastic context-free grammar), has dramatically decreased computational costs, and is applicable to RNA sequences as long as 300 bases. MXSCARNA is an extension of SCARNA that offers progressive alignment and is applicable to RNA sequences as long as 5000 bases, although its accuracy has not been confirmed for sequences longer than 500 bases and the web server restricts sequence length to 1000 bases. We validated Murlet and MXSCARNA by using the BRAlibaseII benchmark dataset (14) and the dataset of Kiryu et al. (7). Both tools showed accuracies in SPS (sum-of-pairs score) comparable to those of ProbCons (21). The accuracies of the predicted common secondary structures were evaluated by MCC (Matthews correlation coefficient), and both tools showed accuracy comparable to that of Stemloc (22).
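For readers unfamiliar with the two measures named above, the sketch below shows one common way to compute them. It is an illustration under stated assumptions, not the evaluation code used for the benchmarks: SPS is taken as the fraction of residue pairs aligned in the reference that are also aligned in the prediction, and MCC is computed over base pairs with true negatives counted as all remaining position pairs of a sequence of the given length (benchmarks differ in how they count true negatives). All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def aligned_pairs(alignment):
    """Set of residue pairs placed in the same column of a gapped alignment."""
    counters = [0] * len(alignment)      # next residue index per sequence
    pairs = set()
    for column in zip(*alignment):
        present = []
        for s, ch in enumerate(column):
            if ch != "-":
                present.append((s, counters[s]))
                counters[s] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(predicted, reference):
    """Fraction of reference residue pairs recovered by the prediction."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(aligned_pairs(predicted) & ref) / len(ref)


def mcc(predicted_pairs, reference_pairs, length):
    """Matthews correlation coefficient over base pairs (assumed TN definition)."""
    pred, ref = set(predicted_pairs), set(reference_pairs)
    tp = len(pred & ref)
    fp = len(pred - ref)
    fn = len(ref - pred)
    tn = length * (length - 1) // 2 - tp - fp - fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Illustrative two-sequence alignments (rows must be in the same order).
reference = ["AC-GU", "ACGGU"]
predicted = ["ACG-U", "ACGGU"]
```

Here `sum_of_pairs_score(predicted, reference)` compares the two toy alignments, and `mcc` takes base pairs as 0-based index tuples together with the sequence length.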

Motif extraction

RNAmine (9) is a tool for extraction of structural motifs. This program uses a graph-mining technique to identify local sequences with frequent stem patterns from among a set of RNA sequences. RNAmine is currently available only on the integrated web server.

Additional tools

In addition to the six software programs described, the tools SOKOS/CAN (1), Stem Kernel (2), miRRim (10), McCaskill-MEA (11) and Rfold (12) are available for download. Although pairwise alignment is the default method, kernels can be used as similarities in an alternative approach for comparing two biological sequences. SOKOS/CAN and Stem Kernel are tools for sequence comparison, both of which use features of the potential secondary structures to calculate the kernel function. SOKOS/CAN calculates the marginalized kernel on SCFG, and Stem Kernel compares the sequences by the kernel based on all possible stem patterns. Predicting non-coding RNAs is difficult because general characteristic sequence patterns are not known. For specific families of non-coding RNAs, however, realistic predictions are possible. We developed miRRim (10) as a tool for finding conserved miRNAs. McCaskill-MEA (11) is a method used to predict consensus secondary structures from given multiple alignments. Rfold (12) is a tool for calculating the local base-pairing probabilities without using sliding windows; it is based on the full energy model of the Vienna RNA Package (23).
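The idea of using a kernel as a similarity between two sequences, mentioned above for SOKOS/CAN and Stem Kernel, can be illustrated with a much simpler stand-in. The sketch below implements a k-mer spectrum kernel, which counts shared substrings of length k and ignores secondary structure entirely. It is not the marginalized kernel of SOKOS/CAN or the stem-pattern kernel of Stem Kernel; it only shows what "kernel as similarity" means. The function names and the choice k = 3 are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers that do not occur in b.
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_kernel(a, b, k=3):
    """Kernel value scaled to [0, 1], so that a sequence scores 1 with itself."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

A structure-aware kernel replaces the k-mer counts with features derived from potential secondary structures, which is where the tools above differ from this sketch.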

DESCRIPTION OF SERVICES

Table 1 shows a list of software tools available at http://software.ncrna.org/.
Table 1. List of software tools at http://software.ncrna.org

| Software tool | Function | Pseudo-knot | Download | Integrated web server: max. no. of seq. | Integrated web server: max. length | Stand-alone web server |
|---|---|---|---|---|---|---|
| SCARNA | Pairwise alignment | Yes | N/A | 5 | 1000 | Yes |
| PHMMTS | Pairwise alignment (to known structure) | No | C++ source | 5 | 1000 | Yes |
| PSTAG | Pairwise alignment (to known structure) | Yes | C++ source | 5 | 70 | Yes |
| MXSCARNA | Multiple alignment | No | C++ source | 10 | 1000 | Yes |
| Murlet | Multiple alignment | No | C++ source | 5 | 300 | Yes |
| RNAmine | Motif extraction | No | contact | 10 | 500 | No |
| SOKOS/CAN | Sequence comparison | No | C source | N/A | N/A | No |
| Stem Kernel | Sequence comparison | Yes | C++ source | N/A | N/A | No |
| miRRim | miRNA finding | No | Source script | N/A | N/A | No |
| McCaskill-MEA | Common secondary structure prediction | No | C++ source | N/A | N/A | No |
| Rfold | Base pairing probabilities | No | C++ source | N/A | N/A | No |

The integrated web server and the stand-alone web servers offer web interfaces for use of SCARNA, PHMMTS, PSTAG, Murlet, MXSCARNA and RNAmine. On the integrated web server, users can select one of the service types: multiple alignment, pairwise alignment or structural motif extraction.

On the menu for ‘multiple alignment’, users can select either Murlet or MXSCARNA. Either direct input or uploading of a file of RNA sequences in multi-FASTA format is accepted. The server outputs a multiple alignment with annotations of the predicted common secondary structure, a figure of the structure and a phylogenetic tree of the sequences.

On the menu for ‘pairwise alignment’, users can select SCARNA, PHMMTS or PSTAG. In SCARNA, either direct input or uploading of a file of RNA sequences in multi-FASTA format is accepted. The server outputs a pairwise alignment with annotations of the predicted common secondary structures and a figure of the structure that includes the two aligned sequences. PHMMTS and PSTAG accept either direct input or uploading of a file of RNA sequences of unknown secondary structures in multi-FASTA format as query sequences, and direct input of an RNA sequence of known secondary structure and its secondary structure in dot-bracket format as the template structure. The server outputs the result of alignments of the query sequences to the template structure with annotations of the secondary structures and the same kind of figures of the structures as SCARNA.

On the ‘structural motif extraction’ menu, users can select RNAmine, which accepts either direct input or uploading of a file of RNA sequences in multi-FASTA format. The server outputs the extracted motifs as abstract figures of the secondary structures and the list of the members by sequence name and positions. Detailed figures and the structure-annotated sequence are linked to the members.
For each of the software tools described, after the server outputs the results, the user can continue to a homology search of the RNA sequences against various genomes by using BLAT. Search hits are linked to the UCSC GenomeBrowser for functional RNAs (24).
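Because the servers take multi-FASTA input and, for PHMMTS and PSTAG, a template structure in dot-bracket format, it can help to check input locally before submitting it. The sketch below is one hedged way to do that in Python. The function names are hypothetical, the limits are passed in by the caller (for example the values listed in Table 1), and nothing here reflects how the servers themselves parse input. Note that plain dot-bracket notation with only `(`, `)` and `.` cannot express pseudoknots; any extended notation PSTAG may accept is not described here and is not handled.

```python
def parse_multi_fasta(text):
    """Return a list of (name, sequence) tuples from multi-FASTA text."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)          # sequence may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def dot_bracket_to_pairs(structure):
    """Convert a dot-bracket string such as '((..))' to 0-based base pairs."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def check_limits(records, max_seqs, max_len):
    """Return a list of human-readable problems; empty if input fits the limits."""
    problems = []
    if len(records) > max_seqs:
        problems.append(f"{len(records)} sequences exceed the limit of {max_seqs}")
    for name, seq in records:
        if len(seq) > max_len:
            problems.append(f"{name}: length {len(seq)} exceeds {max_len}")
    return problems


# Illustrative input; the sequences are made up.
example = ">a\nGGGA\nAACC\n>b desc\nACGU\n"
records = parse_multi_fasta(example)
```

For a SCARNA submission to the integrated server, `check_limits(records, 5, 1000)` would apply the limits given in Table 1. For a template structure, comparing `len(structure)` with the template sequence length is a useful extra check.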

FUTURE PLANS

In addition to web servers, web services for sequence analysis tools are desirable. We have already developed SOAP-based web services for Murlet, MXSCARNA, PHMMTS and PSTAG. The services will start shortly.

CONCLUSION

We have developed web servers for analysis of RNA sequences in light of their secondary structures. The web servers offer six software tools for multiple sequence alignment, pairwise alignment and extraction of structural motifs of RNA sequences. These servers run at practical speeds on tasks that have been regarded as computationally expensive.
  24 in total

1.  PseudoBase: a database with RNA pseudoknots.

Authors:  F H van Batenburg; A P Gultyaev; C W Pleij; J Ng; J Oliehoek
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Marginalized kernels for RNA sequence data analysis.

Authors:  Taishin Kin; Koji Tsuda; Kiyoshi Asai
Journal:  Genome Inform       Date:  2002

3.  Rfam: an RNA family database.

Authors:  Sam Griffiths-Jones; Alex Bateman; Mhairi Marshall; Ajay Khanna; Sean R Eddy
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

4.  Vienna RNA secondary structure server.

Authors:  Ivo L Hofacker
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

5.  Pair hidden Markov models on tree structures.

Authors:  Yasubumi Sakakibara
Journal:  Bioinformatics       Date:  2003       Impact factor: 6.937

6.  Alignment of RNA base pairing probability matrices.

Authors:  Ivo L Hofacker; Stephan H F Bernhart; Peter F Stadler
Journal:  Bioinformatics       Date:  2004-04-08       Impact factor: 6.937

7.  The equilibrium partition function and base pair binding probabilities for RNA secondary structure.

Authors:  J S McCaskill
Journal:  Biopolymers       Date:  1990 May-Jun       Impact factor: 2.505

8.  miRRim: a novel system to find conserved miRNAs with high sensitivity and specificity.

Authors:  Goro Terai; Takashi Komori; Kiyoshi Asai; Taishin Kin
Journal:  RNA       Date:  2007-10-24       Impact factor: 4.942

9.  Stem kernels for RNA sequence analyses.

Authors:  Yasubumi Sakakibara; Kris Popendorf; Nana Ogawa; Kiyoshi Asai; Kengo Sato
Journal:  J Bioinform Comput Biol       Date:  2007-10       Impact factor: 1.122

10.  A fast structural multiple alignment method for long RNA sequences.

Authors:  Yasuo Tabei; Hisanori Kiryu; Taishin Kin; Kiyoshi Asai
Journal:  BMC Bioinformatics       Date:  2008-01-23       Impact factor: 3.169

