Darby Tien-Hao Chang, Yi-Zhong Weng, Jung-Hsin Lin, Ming-Jing Hwang, Yen-Jen Oyang.
Abstract
Geometrical analysis of protein tertiary substructures has been an effective approach to predicting protein binding sites. This article presents the Protemot web server, which predicts protein binding sites using structural templates automatically extracted from the crystal structures of protein-ligand complexes in the PDB (Protein Data Bank). Automatic extraction is essential for creating and maintaining a comprehensive template library that keeps pace with new PDB releases as the number of entries continues to grow rapidly. Protemot is also distinctive in the mechanism it uses to expedite the analysis that matches the tertiary substructures on the contour of the query protein against the templates in the library. This expediting mechanism is essential for providing a reasonable response time, because the template library grows along with the PDB. This article also reports the experiments conducted to evaluate the prediction power of the Protemot web server. The experimental results show that Protemot delivers superior prediction power compared with a web server based on a manually curated template library that contains too few entries. AVAILABILITY: http://protemot.csie.ntu.edu.tw/step1.cgi http://bioinfo.mc.ntu.edu.tw/protemot/step1.cgi
Year: 2006 PMID: 16845015 PMCID: PMC1538868 DOI: 10.1093/nar/gkl344
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Workflow of the analysis procedure incorporated in Protemot.
Figure 2. Pseudo-code of the refinement process.
Statistics of structural similarity between the templates in the Protemot library and those in the CSA library
| Category of the match found in the Protemot library | Number of templates in the CSA library |
|---|---|
| Highly probable | 73 |
| Probable | 17 |
| Possible | 15 |
| Unlikely | 42 |
| Total number of templates in the CSA library | 147 |
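
The category counts above are easier to compare as shares of the CSA library. The following is a minimal sketch that only redoes the arithmetic on the table; the names `MATCH_COUNTS` and `shares` are illustrative and not part of Protemot.

```python
# Counts copied from the table above: CSA templates grouped by the
# category of the match found in the Protemot library.
MATCH_COUNTS = {
    "Highly probable": 73,
    "Probable": 17,
    "Possible": 15,
    "Unlikely": 42,
}

# The categories partition the CSA library, so they sum to its size.
total = sum(MATCH_COUNTS.values())

# Share of the CSA library that falls in each category.
shares = {category: count / total for category, count in MATCH_COUNTS.items()}

for category, count in MATCH_COUNTS.items():
    print(f"{category:<16} {count:>4} {shares[category]:>7.1%}")
print(f"{'Total':<16} {total:>4}")
```

The four categories add up to the 147 templates given in the last row of the table.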
Speed-up achieved with the refinement process
| PDB ID of the query protein | Execution time (in seconds) without the refinement process | Execution time (in seconds) with the refinement process | Speed-up |
|---|---|---|---|
| 1BCK | 4520 | 441 | 10.25 |
| 1A46 | 9312 | 733 | 12.70 |
| 1AAW | 14000 | 1064 | 13.16 |
| 1TRN | 14788 | 1138 | 12.99 |
| 2HGS | 18240 | 1330 | 13.71 |
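
The speed-up column is the ratio of the two execution times. The sketch below recomputes it from the table and adds an aggregate ratio over the five queries; the aggregate is a derived figure that does not appear in the source, and the names `TIMINGS` and `speed_up` are illustrative.

```python
# Execution times in seconds, copied from the table above:
# PDB ID -> (without refinement, with refinement).
TIMINGS = {
    "1BCK": (4520, 441),
    "1A46": (9312, 733),
    "1AAW": (14000, 1064),
    "1TRN": (14788, 1138),
    "2HGS": (18240, 1330),
}


def speed_up(without_s: float, with_s: float) -> float:
    """Ratio of the time without refinement to the time with refinement."""
    return without_s / with_s


# Per-query speed-up, as listed in the last column of the table.
per_query = {pdb_id: speed_up(*times) for pdb_id, times in TIMINGS.items()}

# Aggregate speed-up over all five queries (an assumption of this sketch,
# not a figure reported in the source): total time over total time.
total_without = sum(times[0] for times in TIMINGS.values())
total_with = sum(times[1] for times in TIMINGS.values())
aggregate = speed_up(total_without, total_with)

for pdb_id, ratio in per_query.items():
    print(f"{pdb_id}: {ratio:.2f}x")
print(f"aggregate: {aggregate:.2f}x")
```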
Figure 3. The query page of the Protemot web server.
Figure 4. An example output of the Protemot web server, in which yellow balls are the residues in the query protein (1BCK) that match the template and pink balls are the residues of the template.
Comparison of how Protemot and the CSA-based web server perform based on the fourth-level EC codes
| | CSA (highly probable + probable + possible) | CSA (highly probable + probable) | Protemot | Overlap between Protemot and CSA (highly probable + probable) |
|---|---|---|---|---|
| Number of testing enzymes | 1000 | 1000 | 1000 | |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same fourth-level EC code as the query enzyme and the web server makes a correct prediction. | 81 | 75 | 408 | 44 |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same fourth-level EC code as the query enzyme but the web server makes an incorrect prediction. | 61 | 8 | 310 | 0 |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same fourth-level EC code as the query enzyme but the web server makes no prediction. | 4 | 63 | 14 | 1 |
| The template library does not contain a template that is extracted from a protein–ligand complex structure with the same fourth-level EC code as the query enzyme and the web server makes no prediction. | 65 | 777 | 14 | 13 |
| The template library does not contain a template that is extracted from a protein–ligand complex structure with the same fourth-level EC code as the query enzyme but the web server makes a prediction, which is certainly incorrect. | 789 | 77 | 254 | 28 |
Comparison of how Protemot and the CSA-based web server perform based on the third-level EC codes
| | CSA (highly probable + probable + possible) | CSA (highly probable + probable) | Protemot | Overlap between Protemot and CSA (highly probable + probable) |
|---|---|---|---|---|
| Number of testing enzymes | 1000 | 1000 | 1000 | |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same third-level EC code as the query enzyme and the web server makes a correct prediction. | 143 | 118 | 514 | 80 |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same third-level EC code as the query enzyme but the web server makes an incorrect prediction. | 531 | 28 | 447 | 11 |
| The template library contains a template that is extracted from a protein–ligand complex structure with the same third-level EC code as the query enzyme but the web server makes no prediction. | 47 | 575 | 26 | 14 |
| The template library does not contain a template that is extracted from a protein–ligand complex structure with the same third-level EC code as the query enzyme and the web server makes no prediction. | 22 | 265 | 2 | 2 |
| The template library does not contain a template that is extracted from a protein–ligand complex structure with the same third-level EC code as the query enzyme but the web server makes a prediction, which is certainly incorrect. | 257 | 14 | 11 | 1 |
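
The two EC-code tables above partition the 1000 test enzymes into five outcomes for each server configuration. As a rough aid to reading them, the sketch below derives two summary ratios per column: the fraction of all queries predicted correctly, and the fraction of issued predictions that are correct. Both ratios are assumptions of this sketch and are not reported in the source; the `Outcome` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One server column of an EC-code table, rows in table order."""
    correct: int        # template present, correct prediction
    incorrect: int      # template present, incorrect prediction
    missed: int         # template present, no prediction
    absent_silent: int  # no template, no prediction
    absent_wrong: int   # no template, prediction made (necessarily incorrect)

    @property
    def total(self) -> int:
        return (self.correct + self.incorrect + self.missed
                + self.absent_silent + self.absent_wrong)

    @property
    def predictions(self) -> int:
        # Queries for which the server returned any prediction.
        return self.correct + self.incorrect + self.absent_wrong

    @property
    def correct_fraction(self) -> float:
        return self.correct / self.total

    @property
    def precision(self) -> float:
        return self.correct / self.predictions


# Values copied from the tables above (the overlap column is omitted).
TABLES = {
    "fourth-level": {
        "CSA (highly probable + probable + possible)": Outcome(81, 61, 4, 65, 789),
        "CSA (highly probable + probable)": Outcome(75, 8, 63, 777, 77),
        "Protemot": Outcome(408, 310, 14, 14, 254),
    },
    "third-level": {
        "CSA (highly probable + probable + possible)": Outcome(143, 531, 47, 22, 257),
        "CSA (highly probable + probable)": Outcome(118, 28, 575, 265, 14),
        "Protemot": Outcome(514, 447, 26, 2, 11),
    },
}

for level, columns in TABLES.items():
    print(f"{level} EC codes")
    for server, outcome in columns.items():
        print(f"  {server}: correct on {outcome.correct_fraction:.1%} of queries, "
              f"{outcome.precision:.1%} of {outcome.predictions} predictions correct")
```

Each column sums to the 1000 test enzymes, which gives a quick consistency check on the transcribed values.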