Jean-Christophe Nebel, Pawel Herzyk, David R Gilbert.
Abstract
BACKGROUND: Since many of the new protein structures delivered by high-throughput processes do not have any known function, there is a need for structure-based prediction of protein function. Protein 3D structures can be clustered according to their fold or secondary structures to produce classes of some functional significance. A recent alternative has been to detect specific 3D motifs, which are often associated with active sites. Unfortunately, very few 3D motifs are known compared to the number of sequence motifs, and they are usually the result of a manual process. In this paper, we report a method to automatically generate 3D motifs of protein structure binding sites based on consensus atom positions and evaluate it on a set of adenine-based ligands.
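To make the idea of "consensus atom positions" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the binding sites have already been superimposed on their common ligand, that each site is given as a list of `(element, (x, y, z))` tuples, and that an atom counts as conserved when every site has an atom of the same element within a tolerance `tol`. The function name, the input format and the 1.0 Å default are all illustrative assumptions.

```python
from math import dist

# Hypothetical input format: one atom is (element, (x, y, z)), coordinates in
# angstroms, with all sites already superimposed on the shared ligand.
Atom = tuple[str, tuple[float, float, float]]


def consensus_atoms(sites: list[list[Atom]], tol: float = 1.0) -> list[Atom]:
    """Return averaged positions of atoms that recur in every site.

    An atom of the first site is kept when every other site has an atom of
    the same element within `tol` angstroms (an illustrative threshold).
    """
    reference, others = sites[0], sites[1:]
    pattern = []
    for element, pos in reference:
        matched = [pos]
        for site in others:
            # Closest atom of the same element in this site, if any.
            candidates = [p for e, p in site if e == element]
            if not candidates:
                break
            nearest = min(candidates, key=lambda p: dist(p, pos))
            if dist(nearest, pos) > tol:
                break
            matched.append(nearest)
        else:
            # Every site contributed an atom: average the coordinates.
            n = len(matched)
            centre = tuple(sum(c[i] for c in matched) / n for i in range(3))
            pattern.append((element, centre))
    return pattern
```

Using the first site as the reference keeps the sketch short, but it makes the result depend on which structure is listed first; a real implementation would need a symmetric criterion.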
Year: 2007 PMID: 17760982 PMCID: PMC1995225 DOI: 10.1186/1471-2105-8-321
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Chemical structures of adenine, ATP and guanine.
Initial clusters for each ligand
| Ligand | PDB50 entries | Clusters | Cluster sizes |
|---|---|---|---|
| AMP | 75 | 3 | 5-4-3 |
| ADP | 185 | 9 | 10-9-8-7-6-5-5-4-4 |
| ATP | 133 | 6 | 10-9-4-3-3-3 |
Consensus information about each pattern
| Rep. | Ligand | Size | Atoms | Function | EC | CATH | DALI | IPR | PS |
|---|---|---|---|---|---|---|---|---|---|
| 1vom | ADP | 6 | 11 | Myosin | / | 1.10.162.10 | 1237 | 001093 | |
| 1q0b | ADP | 7 | 7 | Kinesin | / | 3.40.850.10 | 1236 | 001752 | 50067 |
| 1b62 | ADP | 10 | 8 | | | 3.30.565.10 | (846, 847, 848) | 003594 | |
| 1qf9 | ADP | 8 | 23 | Transferase | 2.7. (2.7.1., 2.7.4.) | 3.40.50.300 | | | |
| 1ehi | ADP | 5 | 18 | Ligase (60%) Transferase (40%) | | 3.30. | | | |
| 1njf | ADP | 4 | 17 | | | | 1062 | 003593 | |
| 1oxu | ADP | 5 | 6 | Abc transporter | / | 3.40.50.300 | 1069 | 003593 | 00211 |
| 1v26 | AMP | 4 | 27 | Ligase | | | 736 | 000873 | 00455 |
| 1y8q | ATP | 3 | 71 | Ligase | | | ND | 000205 | |
| 2a40 | ADP | 13 | 42 | Actin (85%) Heat shock (15%) | / | 3.30.420.40 | 1175 | | |
| 1j1c | ADP | 20 | 45 | Kinase | 2.7.1. | 1.10.510.10 | 593 | 000719 | 50011 |
| 1ses | AMP | 8 | 51 | Ligase | 6.1.1. | 3.30.930.10 | 1116 | | 50862 |
| 1o97 | AMP | 7 | 42 | | | 3.40.50.620 | (973, 974, 975) | | |
This table consists of the name of the pattern, its PDB representative (Rep.), names of ligands the pattern was created from, the size of the cluster, the number of atoms in the pattern, the annotated function, the EC (Enzyme Commission) number if relevant [35], the protein fold classifications according to CATH [3] and DALI [2], and finally detected sequence motifs according to InterPro (IPR) [11] and PROSITE (PS) [10]. When consensus values could not be found, but close alternatives were available, values are shown between brackets. Finally, when data was not available through PDBSum, it was inferred when possible using homologues; in this case values are shown in italic. Otherwise when not enough data was available to generate a meaningful consensus value, the code 'ND' is used.
3D patterns detected by other systems
| 1vom | ||||
| 1q0b | ||||
| 1oxu | ||||
| 1y8q | ||||
| 2a40 | ||||
| 1ses |
This table consists of the patterns which were identified by other systems able to detect 3D motifs (i.e. CSA, SuMo and PINTS).
Figure 2. Alignment of the sequences using structural constraints.
Figure 3. Common pattern associated with the ADP4 pattern. Superimposition of atoms from 1ehi, 1e4e, 1gsa, 1kjq & 1iah used for the generation of the pattern. Wireframe shows consensus atoms belonging to the adenine-based ligand. a) CPK colour scheme is used. b) Amino acid/Shapely colour scheme is used.
Figure 4. Superimposition of residues associated with the ADP4 pattern. Superimposition of residues (Wireframe representation and Amino acid/Shapely colour scheme) which have atoms belonging to the pattern. a) [FI] [VLI] [K] part of the pattern ([VLI] is not part of the structural pattern). b) [EQ]-X-[ACVY] [MLV] part of the pattern.
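The two residue patterns quoted in the Figure 4 caption use a PROSITE-like bracket notation, so they can be turned into regular expressions for a quick sequence scan. The sketch below is illustrative only. It assumes that the listed positions are contiguous in sequence, that hyphens and spaces merely separate positions, and that `X` stands for any residue; none of this is stated in the caption. The helper name `prosite_like_to_regex` and the example sequence are made up.

```python
import re


def prosite_like_to_regex(pattern: str) -> str:
    """Convert e.g. '[EQ]-X-[ACVY] [MLV]' into a Python regex string."""
    # Hyphens and spaces only separate positions; X matches any residue.
    tokens = re.split(r"[-\s]+", pattern.strip())
    return "".join("." if t.upper() == "X" else t for t in tokens)


part_a = prosite_like_to_regex("[FI] [VLI] [K]")
part_b = prosite_like_to_regex("[EQ]-X-[ACVY] [MLV]")

# Made-up sequence for illustration only; it is not taken from any PDB entry.
sequence = "MAFVKGGEGAVLT"
hit_a = re.search(part_a, sequence)
hit_b = re.search(part_b, sequence)
```

A sequence match of this kind says nothing about 3D geometry; it only shows where residues compatible with the pattern occur in the primary sequence.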
Total number of PDB entries per ligand and matches against the generated 3D patterns
| Ligand | AMP | ADP | ATP | ANP | ACP | GMP | GDP | GTP |
|---|---|---|---|---|---|---|---|---|
| PDB entries containing ligand | 128 | 406 | 234 | 125 | 29 | 27 | 199 | 77 |
| Hits against the 13 patterns | 14.8% | 30.5% | 35.5% | 36.8% | 44.8% | 0.0% | 0.5% | 1.3% |
| PDB50 entries in valid clusters | 16.0% | 30.8% | 23.3% | / | / | / | / | / |
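The hit percentages in this table can be converted back into approximate absolute counts, which makes the contrast between adenine-based and guanine-based ligands easier to see. The sketch below assumes that the eight columns follow the ligand order given in the Figure 5 caption (AMP, ADP, ATP, ANP, ACP, GMP, GDP, GTP). The counts are obtained by rounding and are not figures reported in the source.

```python
# Column order is an assumption taken from the Figure 5 caption.
ligands = ["AMP", "ADP", "ATP", "ANP", "ACP", "GMP", "GDP", "GTP"]
entries = [128, 406, 234, 125, 29, 27, 199, 77]        # PDB entries per ligand
hit_pct = [14.8, 30.5, 35.5, 36.8, 44.8, 0.0, 0.5, 1.3]  # hits, in percent

# Approximate number of entries matched by at least one of the 13 patterns.
approx_hits = {
    name: round(n * pct / 100)
    for name, n, pct in zip(ligands, entries, hit_pct)
}

adenine = ["AMP", "ADP", "ATP", "ANP", "ACP"]
guanine = ["GMP", "GDP", "GTP"]
adenine_hits = sum(approx_hits[name] for name in adenine)
guanine_hits = sum(approx_hits[name] for name in guanine)

for name in ligands:
    print(f"{name}: ~{approx_hits[name]} hits")
print(f"adenine-based: ~{adenine_hits}, guanine-based: ~{guanine_hits}")
```

Under these assumptions the rounded counts reproduce the tabulated percentages to one decimal place, for example 19 of 128 entries is 14.8%.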
Figure 5. Composition of Training Sets (TS), True Positives (TP) & False Positives (FP) against all PDB proteins binding either purine-based or guanine-based ligands. The x-axis gives the number of binding sites present in each set. Their type, i.e. AMP, ADP, ATP, ANP, ACP, GMP, GDP or GTP binding, is represented by different colours (see legend).
Figure 6. Outline of the methodology used for the generation of 3D patterns and protein annotation.