Yang Wang, Xue-Jia Hu, Xu-Dong Zou, Xian-Hui Wu, Zhi-Qiang Ye, Yun-Dong Wu.
Abstract
WD40-repeat proteins, one of the largest protein families, often serve as platforms for assembling functional complexes through hotspot residues on their domain surfaces, and thus play vital roles in many biological processes. Consequently, researchers who study WD40 proteins and protein-protein interactions need structural information on WD40 domains. Systematic identification of WD40-repeat proteins, including prediction of their secondary structures, tertiary structures and potential hotspot residues responsible for protein-protein interactions, would provide a valuable resource for this purpose. To this end, we developed a specialized database, WDSPdb (http://wu.scbb.pkusz.edu.cn/wdsp/), which provides these details for WD40-repeat proteins based on our recently published method, WDSP. WDSPdb contains 63,211 WD40-repeat proteins identified from 3383 species, including most well-known model organisms. To better serve the community, we implemented a user-friendly interactive web interface to browse, search and download the secondary structures, 3D structure models and potential hotspot residues provided by WDSPdb.
Year: 2014 PMID: 25348404 PMCID: PMC4383882 DOI: 10.1093/nar/gku1023
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Statistics of WDSPdb. The numbers of identified WD40 proteins (with ≥6 repeats), WD40 domains, WD40 repeats and potential hotspots in total, by taxon and for several model organisms
| Category | WD40 proteins | WD40 domains | WD40 repeats | Potential hotspots | Species |
|---|---|---|---|---|---|
| Total | 63 211 | 71 480 | 489 411 | 726 685 | 3383 |
| Eukaryota | 58 284 | 65 311 | 447 323 | 662 833 | 860 |
| Bacteria | 4832 | 6065 | 41 358 | 62 704 | 2476 |
| Archaea | 50 | 59 | 419 | 637 | 34 |
| Virus | 45 | 45 | 311 | 511 | 13 |
| Homo sapiens | 610 | 708 | 4837 | 7084 | |
| Mus musculus | 562 | 659 | 4508 | 6530 | |
| Danio rerio | 407 | 467 | 3242 | 4788 | |
| Drosophila melanogaster | 299 | 319 | 2193 | 3159 | |
| Caenorhabditis elegans | 142 | 157 | 1076 | 1562 | |
| Arabidopsis thaliana | 358 | 384 | 2635 | 3866 | |
| Oryza sativa | 16 | 18 | 123 | 178 | |
| Saccharomyces cerevisiae | 83 | 92 | 635 | 969 | |
| Schizosaccharomyces pombe | 104 | 115 | 787 | 1147 | |
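The statistics table can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of WDSPdb itself: the counts are transcribed by hand from the table, and the names `TOTAL`, `TAXA` and `column_sums` are hypothetical helpers introduced only for illustration. It checks that the four taxon rows add up to the Total row and derives a few average ratios.

```python
# Counts transcribed from the WDSPdb statistics table.
# Column order: proteins, domains, repeats, hotspots, species.
TOTAL = (63211, 71480, 489411, 726685, 3383)
TAXA = {
    "Eukaryota": (58284, 65311, 447323, 662833, 860),
    "Bacteria": (4832, 6065, 41358, 62704, 2476),
    "Archaea": (50, 59, 419, 637, 34),
    "Virus": (45, 45, 311, 511, 13),
}


def column_sums(rows):
    """Sum each column across the given rows."""
    return tuple(sum(col) for col in zip(*rows))


# The taxon rows should partition the Total row.
taxa_sums = column_sums(TAXA.values())
consistent = taxa_sums == TOTAL

# Average ratios derived from the Total row.
proteins, domains, repeats, hotspots, _species = TOTAL
repeats_per_domain = repeats / domains
hotspots_per_repeat = hotspots / repeats
domains_per_protein = domains / proteins

print(f"taxon rows sum to Total: {consistent}")
print(f"repeats per domain:   {repeats_per_domain:.2f}")
print(f"hotspots per repeat:  {hotspots_per_repeat:.2f}")
print(f"domains per protein:  {domains_per_protein:.2f}")
```

The average of just under seven repeats per domain is consistent with the database storing only proteins with at least six repeats.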
Figure 1. The framework of WDSPdb.
Figure 2. (A) The secondary structure table produced by WDSP. Each row represents a WD40 repeat sequence. Secondary structure markers are colored in the table heading. Residues shown in blue in each repeat form the family-conserved DHSW tetrad hydrogen bond networks that stabilize the structure. Residues shown in red are hotspot residues predicted to be responsible for PPI. (B) The structure of the DHSW tetrad hydrogen bond network. (C) The interactive interface, implemented with the JSmol applet, for viewing and manipulating the 3D structure. Clicking a potential hotspot residue listed in the table displays it as sticks with a red label.
Figure 3. The result page for each identified WD40-repeat protein, which comprises the general annotation from the UniProt database, the JSmol applet presenting the predicted 3D structure and the detailed secondary structure table.
Comparison of WD40 protein counts among the WDSPdb, SMART, Pfam, Prosite and UniProt databases and the union set of SMART+Pfam+Prosite+UniProt
| Database | Protein (repeat≥1) | Protein (repeat≥6)¹ | Multi-WD40-domain proteins² |
|---|---|---|---|
| WDSPdb | 99 262 | 63 211 | 7444 |
| SMART | 83 877 | 39 378 | 6511 |
| Pfam | 73 298 | 15 018 | 2256 |
| Prosite | 68 376 | 9750 | 1749 |
| UniProt | 3196 | 2033 | 198 |
| SMART+Pfam+Prosite+UniProt | 84 912 | 39 883 | 6610 |
¹Only proteins with at least six WD40 repeats are stored in WDSPdb, since a WD40 domain requires at least six WD40 repeats to form a complete structure.
²Proteins with more than eight WD40 repeats.
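To make the comparison concrete, the counts in the table can be turned into differences and ratios. The sketch below is illustrative only: the numbers are transcribed by hand from the table, and `COUNTS` and `gain_over` are hypothetical names, not part of any of the databases. Note that it compares counts only; it does not show which proteins are shared between resources.

```python
# Counts transcribed from the comparison table.
# Column order: proteins with >=1 repeat, proteins with >=6 repeats,
# multi-WD40-domain proteins.
COUNTS = {
    "WDSPdb": (99262, 63211, 7444),
    "SMART": (83877, 39378, 6511),
    "Pfam": (73298, 15018, 2256),
    "Prosite": (68376, 9750, 1749),
    "UniProt": (3196, 2033, 198),
    "SMART+Pfam+Prosite+UniProt": (84912, 39883, 6610),
}
COLUMNS = ("repeat>=1", "repeat>=6", "multi-domain")


def gain_over(reference, target="WDSPdb"):
    """Return (count difference, count ratio) per column, target vs. reference."""
    return [(t - r, t / r) for t, r in zip(COUNTS[target], COUNTS[reference])]


# Compare WDSPdb with the union of the other four resources.
union_gain = gain_over("SMART+Pfam+Prosite+UniProt")
for label, (diff, ratio) in zip(COLUMNS, union_gain):
    print(f"{label:>12}: +{diff} proteins ({ratio:.2f}x the union set)")
```

Under these counts, the largest relative gain is in the column for proteins with at least six repeats, the set that WDSPdb actually stores.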