Tomás Di Domenico, Emilio Potenza, Ian Walsh, R Gonzalo Parra, Manuel Giollo, Giovanni Minervini, Damiano Piovesan, Awais Ihsan, Carlo Ferrari, Andrey V Kajava, Silvio C E Tosatto.
Abstract
RepeatsDB (http://repeatsdb.bio.unipd.it/) is a database of annotated tandem repeat protein structures. Tandem repeats pose a difficult problem for the analysis of protein structures, as the underlying sequence can be highly degenerate. Several repeat types have been studied over the years, but their annotation was done on a case-by-case basis, making large-scale analysis difficult. We developed RepeatsDB to fill this gap. Using state-of-the-art repeat detection methods and manual curation, we systematically annotated the Protein Data Bank, predicting 10,745 repeat structures. In all, 2,797 structures were classified according to a recently proposed classification schema, which was expanded to accommodate new findings. In addition, detailed annotations were performed on a subset of 321 proteins. These annotations record the start and end positions of the repeat regions and units. RepeatsDB is an ongoing effort to systematically classify and annotate structural protein repeats in a consistent way. It lets users access and download high-quality datasets either interactively or programmatically through web services.
Year: 2013 PMID: 24311564 PMCID: PMC3964956 DOI: 10.1093/nar/gkt1175
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Screenshot of a sample RepeatsDB entry results page (PDB entry 1ikn). The sequence viewer and the structure viewer are shown in the middle of the page, towards the left and the right, respectively. Additional annotations at the structure and chain level are displayed, including links to other databases (above) and classifications (below).
Statistics for RepeatsDB
| Subclass | Name | Detailed | Classified (manually) | Classified (by similarity) | Predicted |
|---|---|---|---|---|---|
| I.1 | Poly-alanine β structure | 0 | 0 | 0 | 0 |
| II.1 | Collagen triple-helix | 0 | 5 | 0 | 0 |
| II.2 | α helical coiled coil | 23 | 38 | 69 | 0 |
| III.1 | β-solenoid | 43 | 113 | 21 | 0 |
| III.2 | α/β solenoid | 21 | 43 | 27 | 0 |
| III.3 | α-solenoid | 48 | 246 | 631 | 0 |
| III.4 | Trimer of β spirals | 7 | 0 | 13 | 0 |
| III.5 | Single layer anti-parallel β | 4 | 3 | 0 | 0 |
| IV.1 | TIM-barrel | 84 | 118 | 626 | 0 |
| IV.2 | β-barrel | 8 | 1 | 8 | 0 |
| IV.3 | β-trefoil | 20 | 0 | 29 | 0 |
| IV.4 | β-propeller | 40 | 182 | 227 | 0 |
| IV.5 | α/β prism | 0 | 17 | 0 | 0 |
| IV.6 | α-barrel | 6 | 0 | 0 | 0 |
| V.1 | α-beads | 2 | 1 | 0 | 0 |
| V.2 | β-beads | 29 | 12 | 71 | 0 |
| V.3 | α/β-beads | 3 | 3 | 1 | 0 |
| V.other | Unknown subclass | 3 | 0 | 4 | 0 |
| UA | Unassigned | 0 | 0 | 0 | 7948 |
| Total | | 321 | 749 | 1727 | 7948 |
The subclass name is shown together with the number of entries at each of the four annotation levels. Note that ‘Unassigned’ entries are automatically predicted by RAPHAEL and are therefore not assigned to a specific class.
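The totals in the last row of the table can be related to the figures quoted in the abstract with a little arithmetic. The sketch below is illustrative only: the dictionary keys and function names are made up for this example and are not part of RepeatsDB. It also assumes that the four annotation levels are disjoint, which the totals suggest but the text does not state explicitly.

```python
# Totals per annotation level, copied from the "Total" row of the table.
# Key names are hypothetical labels chosen for this example.
totals = {
    "detailed": 321,
    "classified_manually": 749,
    "classified_by_similarity": 1727,
    "predicted": 7948,
}


def classified_count(t):
    """Entries assigned to a class: detailed + manual + by similarity.

    Assumes the annotation levels do not overlap.
    """
    return (
        t["detailed"]
        + t["classified_manually"]
        + t["classified_by_similarity"]
    )


def overall_count(t):
    """All entries, including the unassigned predictions."""
    return classified_count(t) + t["predicted"]


print(classified_count(totals))  # 2797, the classified structures in the abstract
print(overall_count(totals))     # 10745, the predicted repeat structures in the abstract
```

Under that assumption, the three curated levels add up to the 2,797 classified structures, and adding the 7,948 unassigned predictions gives the 10,745 repeat structures reported in the abstract.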