John-Marc Chandonia, Naomi K Fox, Steven E Brenner.
Abstract
The SCOPe (Structural Classification of Proteins-extended, https://scop.berkeley.edu) database hierarchically classifies domains from the majority of proteins of known structure according to their structural and evolutionary relationships. SCOPe also incorporates and updates the ASTRAL compendium, which provides multiple databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. Protein structures are classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.07, we have focused our manual curation efforts on larger protein structures, including the spliceosome, proteasome and RNA polymerase I, as well as many other Pfam families that had not previously been classified. Domains from these large protein complexes are distinctive in several ways: novel non-globular folds are more common, and domains from previously observed protein families often have N- or C-terminal extensions that were disordered or not present in previous structures. The current monthly release update, SCOPe 2.07-2018-10-18, classifies 90 992 PDB entries (about two thirds of PDB entries).
Year: 2019 PMID: 30500919 PMCID: PMC6323910 DOI: 10.1093/nar/gky1134
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
SCOP growth
| Release | Freeze date | Release date | Months to release | Total PDB entries | Total PDB entries classified | Total domains classified |
|---|---|---|---|---|---|---|
| SCOP 1.55 | 2001–03 | 2001–07 | 4 | 13 307 | 13 228 | 31 474 |
| SCOP 1.57 | 2001–10 | 2002–01 | 3 | 14 833 | 14 736 | 35 755 |
| SCOP 1.59 | 2002–03 | 2002–05 | 2 | 16 067 | 15 985 | 39 893 |
| SCOP 1.61 | 2002–09 | 2002–11 | 2 | 17 510 | 17 411 | 44 327 |
| SCOP 1.63 | 2003–03 | 2003–06 | 3 | 19 049 | 18 951 | 49 497 |
| SCOP 1.65 | 2003–08 | 2003–12 | 4 | 20 715 | 20 619 | 54 745 |
| SCOP 1.67 | 2004–05 | 2005–02 | 9 | 24 151 | 24 036 | 65 122 |
| SCOP 1.69 | 2004–10 | 2005–07 | 9 | 26 124 | 25 972 | 70 859 |
| SCOP 1.71 | 2005–01 | 2006–10 | 21 | 27 844 | 27 599 | 75 930 |
| SCOP 1.73 | 2007–09 | 2007–11 | 2 | 44 156 | 34 494 | 97 178 |
| SCOP 1.75 | 2009–02 | 2009–06 | 4 | 53 832 | 38 221 | 110 800 |
| SCOPe 2.01 | 2012–02 | 2012–03 | 1 | 76 312 | 49 219 | 135 634 |
| SCOPe 2.02 | 2012–11 | 2013–01 | 2 | 83 296 | 49 560 | 136 313 |
| SCOPe 2.03 | 2013–08 | 2013–10 | 2 | 90 354 | 59 514 | 167 547 |
| SCOPe 2.04 | 2014–04 | 2014–07 | 3 | 96 087 | 67 580 | 192 710 |
| SCOPe 2.05 | 2014–12 | 2015–02 | 3 | 102 263 | 71 015 | 203 026 |
| SCOPe 2.06 | 2016–01 | 2016–02 | 1 | 113 035 | 77 439 | 244 326 |
| SCOPe 2.07 | 2017–12 | 2018–03 | 3 | 133 747 | 87 224 | 276 231 |
The number of entries and domains in each release of SCOP that used stable identifiers. For each release, the ‘freeze date’ (the date after which no new PDB entries were to be classified in that release) is given; in practice, some entries released just after the freeze date were sometimes included. ‘Total PDB entries’ counts entries that contained protein structures and either were not obsolete as of the freeze date or were included in the release. ‘Total PDB entries classified’ and ‘Total domains classified’ give the number of PDB entries included in each release and the number of domains in those entries. These counts differ slightly from the counts in (6) due to corrections to the dates on which some entries became obsolete. Release 1.71 was the most recent comprehensive SCOP release (i.e. one in which nearly all PDB entries available prior to the freeze date were classified).
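The notion of a ‘comprehensive’ release can be made concrete by dividing the classified entries by the total entries for each row of the table. The following is a minimal sketch of that arithmetic, not a tool from SCOPe itself; the names `ROWS` and `coverage` are hypothetical, and only a handful of rows are copied from the table for illustration.

```python
# Rows copied from the SCOP growth table:
# (release, total PDB entries, total PDB entries classified).
# ROWS and coverage() are illustrative names, not part of SCOPe.
ROWS = [
    ("SCOP 1.71", 27_844, 27_599),
    ("SCOP 1.73", 44_156, 34_494),
    ("SCOP 1.75", 53_832, 38_221),
    ("SCOPe 2.06", 113_035, 77_439),
    ("SCOPe 2.07", 133_747, 87_224),
]


def coverage(total: int, classified: int) -> float:
    """Fraction of the available PDB entries that a release classified."""
    return classified / total


if __name__ == "__main__":
    for release, total, classified in ROWS:
        # Print each release with its coverage as a percentage.
        print(f"{release:<11} {coverage(total, classified):6.1%}")
```

With these rows, release 1.71 covers roughly 99% of the entries available at its freeze date, while the stable 2.07 release covers roughly 65%, in line with the ‘about two thirds’ quoted for the later monthly update (whose total entry count is not listed in the table).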
Figure 1. (A) The spliceosome complex 3jb9 is oriented with the two largest subunits at the top. Prp8 is in blue. Cwf10 is in purple. Note two interactions between the domains: on the right side of the figure, a loop of Prp8 (blue) forms a ‘lasso’ around part of Cwf10. Toward the bottom of the area between the two domains, a long extended region of Cwf10 (purple) interacts with Prp8 and other nearby subunits. Axes (x, y and z colored red, green and blue, respectively) indicate the orientation of (B) and (D) relative to (A). (B) Spliceosome component Prp8 contains five domains, four of which are homologous to domains previously seen in other proteins. The large, mainly α-helical N-terminal domain shown in dark blue has never before been structurally characterized. It includes the ‘lasso’ region (bottom left) that binds Cwf10, as well as an extended helix (bottom right) that binds to another subunit, Cwf14. The second domain is a bromodomain shown in green. The third domain (pink) is homologous to retroviral reverse transcriptase domains. The fourth domain (orange) is homologous to restriction endonucleases. The latter three domains had been predicted bioinformatically (33), and are strongly supported by the structure; all three represented new families in SCOPe, in existing superfamilies. The fifth domain (red) is ribonuclease H-like, as previously classified in SCOP. (C) The extended Prp45 fragment from the yeast spliceosome structure comprises about half of the full-length protein, and binds to at least nine other subunits (25). The observed parts of the structure span a distance of over 150 Å. (D) The first domain in spliceosome component Cwf10 (compact portion, left) belongs to a previously structurally characterized family of proteins that includes Elongation Factor 2 (EF-2). A typical structure of EF-2 (d3b8he1, right) is shown for comparison in the same orientation. The extended region of Cwf10 is stabilized by interactions with Prp8. (E) RNA Polymerase I. A close-up view of the interactions between the A14 protein and the A43 protein in the RNA Polymerase I stalk is shown. A14 is shown in pink, and A43 is in orange. A14 forms a novel non-globular fold consisting of an α hairpin and several β strands involved in heteromeric binding with A43. (F) The 26S proteasome complex 4cr2 is oriented with the regulatory Proteasome-COP9-Initiation factor 3 (PCI) ring on the front left. The horseshoe-shaped ring comprises 6 homologous subunits, Rpn3/5/6/7/9/12, shown in pink/orange/yellow/green/blue/purple. The C-terminal helix of each domain is colored red. (G) One of the PCI subunits, Rpn12, is shown on the left. It contains two domains, a TPR-like superhelical domain on the left side of the structure and a ‘winged helix’ domain, top right of the structure. The C-terminal helix (circled in red) is part of the ‘winged helix’ domain. A previously characterized homolog, eIF3k, is shown on the right. eIF3k contains the same two domains, but the C-terminal helix (circled in red) is broken and packed against the N-terminal domain.