Axel Mosig, Katrin Sameith, Peter Stadler.
Abstract
Many classes of non-coding RNAs (ncRNAs; including Y RNAs, vault RNAs, RNase P RNAs, and MRP RNAs, as well as a novel class recently discovered in Dictyostelium discoideum) can be characterized by a pattern of short but well-conserved sequence elements that are separated by poorly conserved regions of sometimes highly variable lengths. Local alignment algorithms such as BLAST are therefore ill-suited for the discovery of new homologs of such ncRNAs in genomic sequences. The Fragrep tool instead implements an efficient algorithm for detecting the pattern fragments that occur in a given order. For each pattern fragment, the mismatch tolerance and bounds on the length of the intervening sequences can be specified separately. Furthermore, matches can be ranked by a statistically well-motivated scoring scheme.
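To make the search problem described in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not Fragrep's algorithm (the abstract does not specify it, and this version makes no attempt at efficiency); it only illustrates the constraints named above: fragments must occur in the given order, each with its own mismatch tolerance and its own bounds on the gap to the preceding fragment. The tuple layout `(fragment, max_mismatches, min_gap, max_gap)` and all function names are assumptions made for this illustration.

```python
from typing import List, Tuple

# Hypothetical representation of one pattern fragment:
# (fragment, max_mismatches, min_gap, max_gap). The gap is measured from the
# end of the previous fragment to the start of this one; the bounds of the
# first fragment are ignored.
Fragment = Tuple[str, int, int, int]


def hamming_ok(window: str, fragment: str, max_mismatches: int) -> bool:
    """Return True if window differs from fragment in at most max_mismatches positions."""
    mismatches = 0
    for a, b in zip(window, fragment):
        if a != b:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def occurrences(seq: str, fragment: str, max_mismatches: int) -> List[int]:
    """Start positions at which fragment matches seq within the tolerance."""
    k = len(fragment)
    return [i for i in range(len(seq) - k + 1)
            if hamming_ok(seq[i:i + k], fragment, max_mismatches)]


def find_ordered_matches(seq: str, pattern: List[Fragment]) -> List[List[int]]:
    """Enumerate every chain of start positions, one per fragment, in order."""
    # Precompute the candidate positions of each fragment independently.
    hits = [occurrences(seq, frag, mm) for frag, mm, _, _ in pattern]
    results: List[List[int]] = []

    def extend(idx: int, prev_end: int, chain: List[int]) -> None:
        if idx == len(pattern):
            results.append(list(chain))
            return
        fragment, _, min_gap, max_gap = pattern[idx]
        for start in hits[idx]:
            if idx > 0:
                gap = start - prev_end
                if gap < min_gap or gap > max_gap:
                    continue  # violates the gap bounds for this fragment
            chain.append(start)
            extend(idx + 1, start + len(fragment), chain)
            chain.pop()

    extend(0, 0, [])
    return results


# Toy example (made-up sequence, not data from the paper).
toy_seq = "TTGATTACAGGGGGCCGTAAT"
toy_pattern = [("GATTACA", 1, 0, 0), ("CCGTA", 0, 2, 6)]
toy_matches = find_ordered_matches(toy_seq, toy_pattern)
```

In the toy example, the first fragment is found at position 2 and the second at position 14, five bases after the first one ends, so the chain satisfies the gap bounds of 2 to 6. Tightening the upper bound to 4 would leave no match.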
Year: 2006 PMID: 16689703 PMCID: PMC5054030 DOI: 10.1016/S1672-0229(06)60017-X
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Fig. 1. The type-I ncRNAs from Dictyostelium discoideum. Top Left: The phylogenetic tree (neighbor-joining method) suggests that there are two major subgroups, labeled as A and B. Leaf labels refer to the positions of the corresponding occurrences within the genome; for instance, X4a-5 refers to the fifth member within cluster a in Chromosome 4 (see the middle part of the figure); plus (+) or minus (−) indicates the occurrences in the 5′ or 3′ direction. Top Right: The type-I ncRNAs that appear in clusters on all chromosomes. The clusters are labeled by lower case letters, and the italic numbers below the clusters indicate the DdR- numbers of the expressed RNAs from the experimental survey by Aspegren et al. Bottom: The organization of the two largest clusters a and b located at Chromosome 4. Note that type A and type B copies alternate. The other type-I ncRNA clusters consist of no more than three sequences.
Surveys of Mammalian Genomes for vault RNA Candidates
| Genome | Size (Mb) | Runtime (mm:ss) | No. of matches |
|---|---|---|---|
| | 2,980 | 9:24 | 14 |
| | 2,561 | 7:36 | 35 |
| | 2,640 | 8:33 | 44 |
| | 2,454 | 7:55 | 768 |

| Min. gap | Max. gap | Fragment | Mismatches |
|---|---|---|---|
| 0 | 0 | GTTGRCCTTACAGCAA | 2 |
| 0 | 120 | GTCAACTG | 2 |
| 0 | 0 | TRGCNNAGYGG | 1 |
| 0 | 100 | GGTTCGANTCC | 1 |
| 0 | 100 | GGTTCGANTCC | 1 |
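
The fragments in the rows above contain letters other than A, C, G, and T (R, Y, N), which look like IUPAC nucleotide ambiguity codes. Assuming that reading is correct, the short Python sketch below shows how a mismatch count against such a fragment could be computed. The function name and the example windows are hypothetical, and the page does not confirm how Fragrep itself treats these codes.

```python
# Standard IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(window: str, fragment: str) -> int:
    """Count positions where the base in window is not allowed by the fragment code."""
    if len(window) != len(fragment):
        raise ValueError("window and fragment must have the same length")
    # An unrecognized code allows no base, so it always counts as a mismatch.
    return sum(
        1
        for base, code in zip(window.upper(), fragment.upper())
        if base not in IUPAC.get(code, "")
    )


# Made-up windows compared against one fragment from the rows above.
exact = count_mismatches("TAGCTCAGTGG", "TRGCNNAGYGG")
two_off = count_mismatches("TCGCTCAGAGG", "TRGCNNAGYGG")
```

Here `exact` is 0, because every base is permitted by the corresponding code, while `two_off` is 2: C is not a purine (R) and A is not a pyrimidine (Y). A window would be accepted when this count does not exceed the tolerance given for the fragment.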