Matan Drory Retwitzer, Maya Polishchuk, Elena Churkin, Ilona Kifer, Zohar Yakhini, Danny Barash.
Abstract
Searching for RNA sequence-structure patterns is becoming an essential tool for RNA practitioners. Novel discoveries of regulatory non-coding RNAs in targeted organisms and the motivation to find them across a wide range of organisms have prompted the use of computational RNA pattern matching as an enhancement to sequence similarity. State-of-the-art programs differ in the flexibility of the patterns allowed as queries and in their simplicity of use. In particular, no existing method is available as a user-friendly web server. A general program that searches for RNA sequence-structure patterns is RNA Structator. However, it is not available as a web server and does not offer a flexible gap-pattern representation in which an upper bound on the gap length can be specified at any position in the sequence. Here, we introduce RNAPattMatch, a user-friendly web-based application that makes sequence/structure RNA queries accessible to practitioners of varying backgrounds and proficiency. It also extends RNA Structator by allowing a more flexible variable-gap representation, in addition to analysis of results using energy minimization methods. The RNAPattMatch service is available at http://www.cs.bgu.ac.il/rnapattmatch. A standalone version of the search tool is also available for download at the site.
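To make the idea of a sequence-structure query concrete, the following is a minimal brute-force sketch, not the method used by RNAPattMatch or RNA Structator. It assumes a query given as an IUPAC sequence pattern plus a dot-bracket structure pattern of equal length, accepts Watson-Crick and G-U wobble pairs, and omits variable gaps entirely. The names `find_matches` and `paired_positions` are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they stand for (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}

# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure):
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def find_matches(target, seq_pattern, struct_pattern):
    """Scan every window of the target; keep start indices where both the
    sequence pattern and the base-pairing pattern are satisfied."""
    if len(seq_pattern) != len(struct_pattern):
        raise ValueError("sequence and structure patterns differ in length")
    target = target.upper().replace("T", "U")  # treat DNA input as RNA
    seq_pattern = seq_pattern.upper().replace("T", "U")
    pairs = paired_positions(struct_pattern)
    width = len(seq_pattern)
    hits = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        seq_ok = all(base in IUPAC[code] for base, code in zip(window, seq_pattern))
        if seq_ok and all((window[i], window[j]) in PAIRS for i, j in pairs):
            hits.append(start)
    return hits


hits = find_matches("AAGGGAAACCCUU", "NNNNNNNNN", "(((...)))")
```

A scan like this costs time proportional to the target length times the pattern length for every query, which is one reason practical tools build a reusable index over the target instead.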
Year: 2015 PMID: 25940619 PMCID: PMC4489251 DOI: 10.1093/nar/gkv435
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The input screen of the RNAPattMatch web server with the example sequence-structure pattern query taken from (19), a 2.6 MB target file containing the complete genome of Thermoanaerobacter tengcongensis MB4, and example parameters inserted. The target file (up to 100 MB) may contain several genome sequences in FASTA format.
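Because a target file may hold several genome sequences, it can help to split it into records before searching. The sketch below is a generic FASTA reader, not code from the RNAPattMatch server; the function name `read_fasta` and the sample records are made up for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) for each record in a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        yield header, "".join(chunks)


# Illustrative two-record input; real targets would be read from a file.
sample = io.StringIO(">genome1 example\nACGU\nGGCC\n\n>genome2\nUUAA\n")
records = list(read_fasta(sample))
```

Reading lazily with a generator keeps memory use modest when the caller only needs one record at a time, which matters for targets near the 100 MB limit.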
Figure 2. The results screen of the RNAPattMatch web server. Matches are listed in a table with options to sort and filter by selected parameters. Each row describes one match: its start index in the target, the corresponding target sequence and its aligned structure (including the gaps used for that match), the number of gaps used, the matrix cost, the energy of the match, and an option to view the predicted secondary structure drawing alongside the secondary structure drawing of the match itself. An additional query can be run on the same file; if caching of the data structure has completed, the cached structure is used.
Figure 3. The predicted secondary structure of the first match found, using Mfold (9) for the secondary structure drawing, compared with the secondary structure drawing of the match itself. Secondary structures are given in Vienna's dot-bracket representation (10) and the coarse-grained Shapiro representation (30). As expected from the pattern query, the structure drawings correspond to the purine riboswitch aptamer described in (19).
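The comparison in Figure 3 is visual, but two dot-bracket strings of the same length can also be compared numerically. The sketch below computes the base-pair distance, i.e. the number of pairs present in exactly one of the two structures. Whether the server computes such a value is not stated in the source; the function names are hypothetical and the example strings are made up.

```python
def base_pairs(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Count base pairs that appear in one structure but not the other."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have the same length")
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


# Made-up example: the second structure lacks the innermost pair of the first.
distance = base_pair_distance("(((...)))", "((.....))")
```

A distance of zero means the predicted fold and the matched pattern agree on every pair; larger values indicate how far the energy-minimized prediction departs from the queried structure.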
Running times for the example queries on selected target files
| Query | Human chromosome 16, hg38 (GRCh38), 89 MB | | |
|---|---|---|---|
| Guanine-binding riboswitch aptamer [a] | 2 s [b] | 6 s | 127 s |
| Hairpin with G-C stem [c] | 7 s | 29 s | 146 s |

[a] Matches for the guanine riboswitch were not found in the non-bacterial organisms reported in the table.
[b] Running times were taken from the RNAPattMatch web server on non-cached targets and do not include file upload time.
[c] The difficulty of the query pattern is dictated by the number of hairpins and the size of their search space, as observed in (25).
Query sequences are available as examples in the web server.