Ladislav Rampášek, Randi M Jimenez, Andrej Lupták, Tomáš Vinař, Broňa Brejová.
Abstract
BACKGROUND: In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms.
Keywords: Entropy; Pseudoknot; RNA motif search; Search order
Year: 2016 PMID: 27188396 PMCID: PMC4870747 DOI: 10.1186/s12859-016-1074-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. An illustration of an ATP aptamer motif and its corresponding descriptor based on genomic adenosine aptamers [21]. Nucleotide constraints for individual positions are expressed in the IUPAC notation [51].
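To make the IUPAC nucleotide constraints concrete, the following is a minimal Python sketch of how a sequence constraint could be checked against a candidate window. The function name `matches` and the treatment of `T` as `U` are assumptions made for illustration; this is not RNArobo's actual descriptor parser.

```python
# Standard IUPAC nucleotide codes mapped to the RNA bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if `sequence` satisfies the IUPAC `pattern` position by position.

    Hypothetical helper: DNA-style T is treated as U on both sides, and no
    mismatches or insertions are tolerated.
    """
    if len(pattern) != len(sequence):
        return False
    pat = pattern.upper().replace("T", "U")
    seq = sequence.upper().replace("T", "U")
    # Every base must belong to the set allowed by the code at its position.
    return all(base in IUPAC[code] for code, base in zip(pat, seq))
```

For example, `matches("GRNA", "GAUA")` holds because `R` allows `A` and `N` allows any base, while `matches("GRNA", "GCUA")` fails because `R` does not allow `C`.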
Fig. 2. An illustration of the RNArobo search procedure for the motif of ATP aptamer. The search follows the order of elements s1, s3, h2, h1, s2.
Fig. 3. Computation of the search domain for a single-stranded element s3. Here we have a partly matched motif composed of five single-stranded elements s1,…,s5. Assume that elements s1 and s5 have already been matched. The match of s3 has to start in the left green interval and end in the right green interval, and it has to completely cover the red interval in the middle.
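The domain computation described in the Fig. 3 caption can be sketched in Python as follows. This is an illustrative reading of the caption, not the RNArobo implementation: the names `Domain` and `search_domain` are hypothetical, positions are 0-based with exclusive end coordinates, the unmatched neighbours (s2 and s4 in the caption) are assumed to have known minimum and maximum lengths, and the length bounds of s3 itself are ignored for simplicity.

```python
from typing import NamedTuple, Optional, Tuple


class Domain(NamedTuple):
    start_range: Tuple[int, int]           # inclusive bounds on the start of the element
    end_range: Tuple[int, int]             # inclusive bounds on its (exclusive) end coordinate
    must_cover: Optional[Tuple[int, int]]  # half-open interval covered by every valid match


def search_domain(left_end: int, right_start: int,
                  left_gap: Tuple[int, int],
                  right_gap: Tuple[int, int]) -> Domain:
    """Domain of an element lying between two already matched elements.

    left_end    -- exclusive end of the matched element on the left (s1)
    right_start -- start of the matched element on the right (s5)
    left_gap    -- (min, max) length of the unmatched element in between (s2)
    right_gap   -- (min, max) length of the unmatched element in between (s4)
    """
    lo_left, hi_left = left_gap
    lo_right, hi_right = right_gap
    # The start is pushed right by at least min and at most max of the left gap.
    start_range = (left_end + lo_left, left_end + hi_left)
    # The end is pulled left by at least min and at most max of the right gap.
    end_range = (right_start - hi_right, right_start - lo_right)
    # Positions between the latest start and the earliest end lie in every match.
    latest_start, earliest_end = start_range[1], end_range[0]
    must_cover = (latest_start, earliest_end) if latest_start < earliest_end else None
    return Domain(start_range, end_range, must_cover)
```

With s1 ending at position 10, s5 starting at position 50, s2 of length 2 to 5 and s4 of length 3 to 8, `search_domain(10, 50, (2, 5), (3, 8))` gives starts in 12..15, ends in 42..47, and a mandatory covered interval from 15 up to 42. When the flanking matches are close together the mandatory interval can be empty, which the sketch reports as `None`.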
Fig. 4. Descriptor for HDV-like ribozyme with structured P4 region. The motif contains four paired elements organized in a double pseudoknot.
Summary of reproducing results from the literature using RNArobo. Searches were extended by allowing for insertions in structurally conserved elements that are known to tolerate single base insertions. This extension led to improved sensitivity and yielded several new putatively functional ribozymes
| | | | | |
|---|---|---|---|---|
| In vitro selected library (∼65 kBp scanned) | GTP apt. class I | 9 | yes | |
| | GTP apt. class I w/ ins | 10 | yes [ | novel hit |
| (∼41 MBp scanned) | HHR I (4 bp) | 1 | Yli-1-3 | |
| | HHR I (4 bp) w/ ins | 15 | Yli-1-3 through Yli-1-11 | |
| | HHR I (3 bp) | 4 | Yli-1-3 and Yli-1-13 | |
| | HHR I (3 bp) w/ ins | 54 | Yli-1-3 through Yli-1-11, and Yli-1-13 [ | novel family (10 hits) |
| (∼11 MBp scanned) | HHR II | 1 | Bce-1-1 | |
| | HHR II w/ ins | 4 | Bce-1-1 [ | |
| (∼98 MBp bases scanned) | HDV (loose P4) | 7 | Agam-1-1 | |
| | HDV (loose P4) w/ ins | 36 | Agam-1-1 and Agam-1-2 [ | |
| (∼2.1 GBp scanned) | HDV (stem P4) | 11 | yes | |
| | HDV (stem P4) w/ ins | 11 | yes | |
| | HDV (loose P4) + FF | 15 | yes | |
| | HDV (loose P4) w/ ins + FF | 16 | yes [ | novel hit |
FF: only hits passing the Fold-Filter are reported
The running times (in seconds) of different programs searching for various descriptors in the whole human genome
| | ATP apt. | GTP apt. class I | generalized tRNA | HHR-I (4 bp) | HHR-II (3 bp) | HHR extended | HDV (loose P4) | HDV (stem P4) | HDV (mispairs) |
|---|---|---|---|---|---|---|---|---|---|
| RNAbob | 3,419.03 | 2,744.30 | 7,450.07 | 923.53 | 5,027.33 | 2,269.35 | 209,932.57 | 43,430.78 | 36,459.87 |
| RNAmotif | | 222.54 | 7,374.32 | | 265.09 | 116.15 | 26,259.79 | 2,513.90 | 9,240.83 |
| RNAMot | unf | unf | unf | unf | unf | unf | unf | 4,538.92 | 8,925.31 |
| RNArobo | 80.91 | | | 96.48 | | | | | |
| RNArobo-ins | – | 153.38 | – | 98.22 | 137.47 | – | 173.82 | 171.54 | – |
Experiments were run on an Intel Xeon E5520 CPU. RNArobo-ins is RNArobo run with modified descriptors allowing insertions in helical elements. RNAMot did not finish on most of the inputs within the time limit of three days; only results that finished within three days are shown. Since DDEO is randomized, we show the average running time of five runs of RNArobo. The standard deviation was up to 3 % or 5 s, with the exception of the HHR extended descriptor, where the running time ranged from 98 to 125 s.
Boldface numbers represent the best running times for a particular descriptor