Marcin von Grotthuss, Dariusz Plewczynski, Gert Vriend, Leszek Rychlewski.
Abstract
The 'omics' revolution is producing a flood of data, all of which needs to be annotated to become useful. Sequences of proteins of unknown function can be annotated with a putative function by comparing them with proteins of known function. This form of annotation is typically performed with BLAST or similar software. Structural genomics is now also delivering three-dimensional structures of proteins with unknown function. We present here software that can be used when sequence comparisons fail to determine the function of a protein with known structure but unknown function. The software, called 3D-Fun, is implemented as a server that runs at several European institutes and is freely available to everybody at all these sites. The 3D-Fun servers accept protein coordinates in the standard PDB format and compare them with all known protein structures by 3D structural superposition using the 3D-Hit software. If structural hits are found with proteins of known function, these are listed together with their function and some vital comparison statistics. This is conceptually very similar in 3D to what BLAST does in 1D. Additionally, the superposition results are displayed using interactive graphics facilities. Currently, the 3D-Fun system only predicts enzyme function, but an expanded version with Gene Ontology predictions will be available soon. The server can be accessed at http://3dfun.bioinfo.pl/ or at http://3dfun.cmbi.ru.nl/.
Year: 2008 PMID: 18515349 PMCID: PMC2447717 DOI: 10.1093/nar/gkn308
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
A few of the many structural genomics efforts
| Consortium | WWW page | No. of deposited PDB files |
|---|---|---|
| SGC | | 500 |
| NESC | | 500 |
| TBSGC | | 500 |
| SGXRC | | 500 |
| BSGC | | 100 |
| CESG | | 100 |
| JCSG | | 600 |
| MCSG | | 750 |
| YSG | | 25 |
| RSGI | | 2000 |
| SGPP | | 50 |
| SECSG | | 75 |
| PSF | | 20 |
| SPINE | | 100 |
The three columns give the name of the consortium, its WWW page and its stated number of deposited PDB files, respectively. Note that collaborations between centers in consortia may have caused double counting of PDB entries.
Figure 1. A screenshot of the 3D-Hit result list. In the top panel, the user selects the sort type and the maximum number of hits to be displayed. The backbone superposition of the query structure and the database hits selected by the user is displayed with Jmol (13). The residues are colored from red (N-terminus) to blue (C-terminus). The database hits are shown below the Jmol viewer. If the superposition score is above the false-positive cutoff score, the corresponding EC number is listed in blue. Further details are explained on the server help page.
Figure 2. ROC curves for the 1st EC level (upper left chart); 1st and 2nd EC levels (upper right chart); 1st, 2nd and 3rd EC levels (lower left chart) and for all four EC levels (lower right chart). Note that the ROC curves for the random case (shown in black) are not diagonal lines, as is usual in ROC plots. This is a consequence of the fact that prediction of enzyme function is a more difficult problem than bimodal classification: with many possible EC numbers, the probability of assigning an incorrect EC number in the random test is much greater than that of assigning a correct one.
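To make the four evaluation levels concrete, the following is a minimal sketch of how a predicted EC number could be compared with a reference EC number at a chosen depth. The function name `ec_levels_match` and the rule that an unspecified field ("-" or "–") never counts as a match are assumptions made for illustration; the source does not describe how the 3D-Fun benchmark scored predictions.

```python
def ec_levels_match(predicted: str, actual: str, levels: int) -> bool:
    """Return True if the first `levels` EC fields of both numbers agree.

    Hypothetical helper: an unspecified field ('-' or the en dash used in
    the tables) is treated as "unknown" and never counts as a match.
    """
    if not 1 <= levels <= 4:
        raise ValueError("levels must be between 1 and 4")

    def fields(ec: str) -> list:
        # Split "2.1.1.37" into ["2", "1", "1", "37"], tolerating stray spaces.
        return [part.strip() for part in ec.split(".")][:levels]

    pred, ref = fields(predicted), fields(actual)
    if len(pred) < levels or len(ref) < levels:
        return False
    unknown = {"-", "–", ""}
    if any(p in unknown for p in pred) or any(r in unknown for r in ref):
        return False
    return pred == ref


# Two methyltransferases that agree on the first three levels only.
print(ec_levels_match("2.1.1.37", "2.1.1.72", 3))
print(ec_levels_match("2.1.1.37", "2.1.1.72", 4))
```

Under this scoring, a prediction counted as correct at four levels is also correct at every shallower level, so the accuracy can only stay the same or fall as the depth increases.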
Five examples of functional annotations made using 3D-Fun
| No. | PDB accession codes | Predicted EC number | Predicted enzyme function |
|---|---|---|---|
| 1. | 2QMM, 2QWV | 2.1.1.– | Methyltransferase |
| 2. | 1ZEE | 1.13.11.– | Indoleamine 2,3-dioxygenase |
| 3. | 2G7Z | 2.7.1.– | Phosphotransferase with an alcohol group as acceptor |
| 4. | 3BBJ | 3.1.2.– | Thioester hydrolase |
| 5. | 1YS9 | 3.1.3.– | Hydrolase |
Predictions 1 and 2 are explained in greater detail in the text.
Figure 3. Superposition of the query structure 1ZEE and the structure of indoleamine 2,3-dioxygenase (PDB code 2D0T). These two proteins share only 17% sequence identity, but over 66% of their C-α atoms are aligned and within 3 Å of each other. The chains are colored from blue (N-termini) to red (C-termini).
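The "aligned and within 3 Å" statistic can be illustrated with a short sketch that counts aligned C-α pairs lying within a distance cutoff after the two structures have been superposed. Everything here is an assumption for illustration: the function name `fraction_within_cutoff`, the input layout (coordinate tuples plus a list of aligned index pairs), and the choice of the query's C-α count as the denominator. The source does not state how 3D-Hit normalizes this value.

```python
import math


def fraction_within_cutoff(query_ca, hit_ca, aligned_pairs, cutoff=3.0):
    """Fraction of query C-alpha atoms aligned to a hit atom within `cutoff` Å.

    query_ca, hit_ca: lists of (x, y, z) tuples, already superposed.
    aligned_pairs: list of (query_index, hit_index) tuples.
    Assumption: the denominator is the number of query C-alpha atoms.
    """
    if not query_ca:
        raise ValueError("query structure has no C-alpha atoms")
    close = sum(
        1
        for qi, hi in aligned_pairs
        if math.dist(query_ca[qi], hit_ca[hi]) <= cutoff
    )
    return close / len(query_ca)


# Toy coordinates: three aligned pairs at distances of 1, 2 and 5 Å,
# plus one query atom with no aligned partner.
query = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
hit = [(0.0, 1.0, 0.0), (3.8, 2.0, 0.0), (7.6, 5.0, 0.0)]
pairs = [(0, 0), (1, 1), (2, 2)]
print(fraction_within_cutoff(query, hit, pairs))  # 2 of 4 query atoms -> 0.5
```

A geometric measure of this kind does not depend on residue identity, which is why two structures with low sequence identity can still score highly.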