Sarah W Burge, Jennifer Daub, Ruth Eberhardt, John Tate, Lars Barquist, Eric P Nawrocki, Sean R Eddy, Paul P Gardner, Alex Bateman.
Abstract
The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.
Year: 2012 PMID: 23125362 PMCID: PMC3531072 DOI: 10.1093/nar/gks1005
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Growth of Rfam families and sequence regions annotated in each Rfam release. Dates are in MM/YY format.
Figure 2. (a) Rfam types organized according to their coverage of sequence space. Size of rectangles is proportional to the number of regions annotated by families of that Rfam type; colour of rectangles is proportional to the number of families of that type. (b) Taxonomic coverage of Rfam families. Families have been categorized according to the taxonomic groups covered by the seed sequences, and families in clans are treated as belonging to a single family. This analysis omits six families where the seed contains only unclassified sequences.
Comparison of alignment sizes for the four largest Rfam families: full alignments in Rfam 10.1, full alignments in Rfam 11.0 and genome-based alignments
| Family | Description | Number of sequences in full, Rfam 10.1 | Number of sequences in full, Rfam 11.0 | Number of sequences in genome-based alignment |
|---|---|---|---|---|
| RF00005 | tRNA | 1 101 833 | 2 106 268 | 298 470 |
| RF00177 | Bacterial small subunit ribosomal RNA | 343 886 | 744 528 | 7429 |
| RF01959 | Archaeal small subunit ribosomal RNA | 9072 | 881 056 | 7394 |
| RF01960 | Eukaryotic small subunit ribosomal RNA | 45 117 | 65 901 | 425 |
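To make the size difference between full and genome-based alignments concrete, the ratio of the two can be computed directly from the table. The following is a minimal sketch that uses only the counts shown above; the names `ALIGNMENT_SIZES` and `reduction_factor` are illustrative and do not come from Rfam.

```python
# Sequence counts copied from the table above:
# (family, Rfam 11.0 full alignment, genome-based alignment)
ALIGNMENT_SIZES = [
    ("RF00005", 2_106_268, 298_470),
    ("RF00177", 744_528, 7_429),
    ("RF01959", 881_056, 7_394),
    ("RF01960", 65_901, 425),
]


def reduction_factor(full: int, genome_based: int) -> float:
    """Return how many times smaller the genome-based alignment is
    than the full alignment (hypothetical helper for illustration)."""
    return full / genome_based


for family, full, genome_based in ALIGNMENT_SIZES:
    factor = reduction_factor(full, genome_based)
    print(f"{family}: genome-based alignment is about {factor:.1f}x smaller")
```

On these figures the tRNA family shrinks by less than an order of magnitude, while the three ribosomal RNA families shrink by roughly two orders of magnitude.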
Editing statistics for Wikipedia articles linked to by Pfam and Rfam, classified by editor type. Character values are rounded to the nearest 1000.
| Type | No. of edits | No. of editors | Total characters added (1000s) | Total characters subtracted (1000s) | Average characters added per Editor (1000s) | Average characters subtracted per Editor (1000s) |
|---|---|---|---|---|---|---|
| Xfam | 2604 | 12 | 2166 | 321 | 181 | 27 |
| Scientist | 955 | 92 | 2645 | 730 | 29 | 8 |
| Wikipedian | 2570 | 85 | 4033 | 2506 | 47 | 30 |
| Bot | 2783 | 78 | 730 | 337 | 9 | 4 |
| Long tail | 6636 | 3454 | 1767 | 402 | 1 | 0.1 |
| Total | 15 548 | 3721 | 11 340 | 4296 | 267 | 69 |
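The per-editor averages in the table follow from dividing each character total by the number of editors of that type. The sketch below recomputes them from the rows above; the names `EDIT_STATS` and `per_editor` are illustrative only. It also computes a pooled average across all editors, which is an added calculation and not a figure reported in the table.

```python
# Rows copied from the table above:
# (editor type, number of editors, characters added (1000s), characters subtracted (1000s))
EDIT_STATS = [
    ("Xfam", 12, 2166, 321),
    ("Scientist", 92, 2645, 730),
    ("Wikipedian", 85, 4033, 2506),
    ("Bot", 78, 730, 337),
    ("Long tail", 3454, 1767, 402),
]


def per_editor(total_thousands: float, editors: int) -> float:
    """Average characters (in 1000s) per editor (hypothetical helper)."""
    return total_thousands / editors


for editor_type, editors, added, subtracted in EDIT_STATS:
    print(
        f"{editor_type}: "
        f"{per_editor(added, editors):.1f}k added, "
        f"{per_editor(subtracted, editors):.1f}k subtracted per editor"
    )

# Pooled average over every editor, regardless of type.
total_editors = sum(row[1] for row in EDIT_STATS)
total_added = sum(row[2] for row in EDIT_STATS)
pooled_added = per_editor(total_added, total_editors)
print(f"Pooled: {pooled_added:.1f}k added per editor")
```

Note that the two average columns in the "Total" row appear to be the sums of the per-type averages above them (181 + 29 + 47 + 9 + 1 = 267), so they should not be read as the average contribution of a typical editor; the pooled figure computed here is far smaller.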
Figure 3. Sunburst visualization of family taxonomy for RF01051, the cyclic di-GMP-I riboswitch. Users may select regions of taxonomic space and use the controls on the left to download an alignment of their chosen subset of species.