Bingding Huang, Michael Schroeder.
Abstract
BACKGROUND: Identifying pockets on protein surfaces is of great importance for many structure-based drug design applications and protein-ligand docking algorithms. Over the last ten years, many geometric methods for the prediction of ligand-binding sites have been developed.
Year: 2006 PMID: 16995956 PMCID: PMC1601958 DOI: 10.1186/1472-6807-6-19
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Figure 1. Pocket identification methods. a. POCKET, LIGSITE, and LIGSITEcs scan the grid for protein-solvent-protein events (POCKET, LIGSITE) and surface-solvent-surface events (LIGSITEcs). POCKET scans in 3 directions; LIGSITE and LIGSITEcs scan in 7. POCKET and LIGSITE use atom coordinates, while LIGSITEcs uses the Connolly surface. b. SURFNET places a sphere, which must not contain any atoms, between two atoms. The spheres with maximal volume define the largest pocket. c. CAST triangulates the surface atoms and clusters triangles by merging small triangles into neighbouring large triangles. d. PASS coats the protein with probe spheres, selects probes with many atom contacts, and repeats the coating until no new probes are kept. The pockets, or active site points, are the probes with a large number of atom contacts.
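The grid scan in panel a can be illustrated with a minimal sketch. It counts protein-solvent-protein events on a toy integer grid, in the spirit of the original LIGSITE scan; it does not compute a Connolly surface and is not the authors' implementation. The function names, the toy "protein" (a hollow box with an opening), and the threshold `min_events` are assumptions made for illustration only.

```python
from itertools import product

# Three axis directions (all that POCKET uses) plus the four cubic diagonals.
DIRECTIONS = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1),
]


def meets_protein(point, step, protein, shape):
    """Walk from `point` along `step`; True if protein is met before leaving the grid."""
    q = tuple(c + s for c, s in zip(point, step))
    while all(0 <= c < n for c, n in zip(q, shape)):
        if q in protein:
            return True
        q = tuple(c + s for c, s in zip(q, step))
    return False


def count_events(protein, shape):
    """For each solvent grid point, count the directions enclosed by protein on both sides."""
    counts = {}
    for point in product(*(range(n) for n in shape)):
        if point in protein:
            continue  # only solvent points are scored
        counts[point] = sum(
            meets_protein(point, d, protein, shape)
            and meets_protein(point, tuple(-s for s in d), protein, shape)
            for d in DIRECTIONS
        )
    return counts


# Toy "protein": a hollow 5x5x5 box whose top face has a 3x3 opening.
shape = (5, 5, 5)
shell = {p for p in product(range(5), repeat=3) if 0 in p or 4 in p}
opening = {(x, y, 4) for x in range(1, 4) for y in range(1, 4)}
protein = shell - opening

counts = count_events(protein, shape)
min_events = 5  # hypothetical threshold, not a value taken from the paper
pocket = {p for p, n in counts.items() if n >= min_events}
print(counts[(2, 2, 2)], counts[(2, 2, 4)])
```

The buried centre of the box is enclosed in six of the seven directions (only the scan through the opening escapes), while a point in the opening itself is enclosed in just two, so a threshold on the event count separates buried solvent from exposed solvent.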
PDB codes of the 210 protein-ligand complexes taken from the PLD database.
| 1a0q | 1a28 | 1a42 | 1a4g | 1a6w | 1a9u | 1aaq | 1abe | 1ac0 | 1acj | 1aco | 1adb |
| 1add | 1adf | 1aec | 1aha | 1ai5 | 1aj7 | 1ake | 1anf | 1aoe | 1apt | 1ase | 1azm |
| 1b59 | 1b6n | 1b9v | 1baf | 1bap | 1bcd | 1bgo | 1bhf | 1bl7 | 1blh | 1bma | 1bmq |
| 1bra | 1byb | 1byg | 1c2t | 1c5c | 1c5x | 1c83 | 1cbs | 1cbx | 1cdg | 1ckp | 1cla |
| 1cle | 1coy | 1cps | 1cqp | 1ctr | 1ctt | 1d0l | 1d3h | 1dbb | 1dd7 | 1dg5 | 1dhf |
| 1did | 1dih | 1dmp | 1dog | 1dr1 | 1e96 | 1eap | 1ebg | 1eed | 1ei1 | 1ejn | 1ela |
| 1eoc | 1epb | 1eta | 1exw | 1f0r | 1fbl | 1fen | 1fgi | 1fkb | 1fki | 1fmo | 1frp |
| 1glp | 1gpy | 1hak | 1hbv | 1hdy | 1hew | 1hfc | 1hti | 1hyt | 1ibg | 1icn | 1ida |
| 1imb | 1inc | 1ivb | 1ivc | 1jao | 1l82 | 1lah | 1lcp | 1ldm | 1lgr | 1lic | 1lmo |
| 1lpm | 1mbi | 1mfc | 1mmp | 1mmq | 1mrg | 1mrk | 1mts | 1mup | 1nco | 1nsc | 1okl |
| 1pbd | 1pdz | 1pgp | 1pha | 1poc | 1ppi | 1ppk | 1pso | 1qbr | 1qcf | 1qh7 | 1qpe |
| 1rbp | 1rds | 1rgk | 1rne | 1rob | 1rpa | 1rt2 | 1sln | 1slt | 1snc | 1sre | 1stp |
| 1tdb | 1thl | 1tlc | 1tng | 1tph | 1ukz | 1ulb | 1uvs | 1vgc | 1xid | 1ydr | 2aad |
| 2ack | 2ada | 2ak3 | 2cmd | 2cpp | 2csc | 2ctc | 2er0 | 2fox | 2gbp | 2gpb | 2ifb |
| 2msb | 2phh | 2pk4 | 2qwb | 2sim | 2sns | 2tsc | 2xis | 2yhx | 2ypi | 3cla | 3dfr |
| 3er3 | 3ert | 3fx2 | 3gch | 3gpb | 3hvt | 3nos | 3ts1 | 4cts | 4dfr | 4est | 4gr1 |
| 4hvp | 4lbd | 4mbp | 4tln | 4xia | 5abp | 5cpp | 5er1 | 5p21 | 5p2p | 6acn | 6cpa |
| 6rnt | 6rsa | 7lpr | 7tim | 9aat | 9icd |
Comparison of LIGSITEcs, LIGSITE, PASS, SURFNET, and CAST on 48 unbound structures.
| Complex | Unbound | LIGSITEcs¹ | | LIGSITE² | | PASS³ | | SURFNET⁴ | | CAST | |
| PDB | PDB | Hits⁵ | Dist.⁶ | Hits | Dist. | Hits | Dist. | Hits⁷ | Dist. | Hits | Dist. |
| 1bid | 3tms | 1 | 3.4 | 1 | 2.0 | 1 | 3.9 | 1 | 3.9 | 1 | 3.1 |
| 1cdo | 8adh | 1 | 0.8 | 1 | 0.6 | 1 | 0.2 | 1 | 1.3 | 1 | 0.8 |
| 1dwd | 1hxf | 1 | 1.7 | 1 | 2.3 | 1 | 0.7 | 1 | 2.3 | 1 | 0.9 |
| 1fbp | 2fbp | 1 | 0.5 | 1 | 0.6 | (2) | 0.8 | - | - | 1 | 1.5 |
| 1gca | 1gcg | 1 | 0.8 | 1 | 0.8 | 1 | 0.5 | 1 | 3.4 | 1 | 0.5 |
| 1hew | 1hel | 1 | 1.8 | 1 | 1.8 | 1 | 1.0 | 1 | 2.6 | 1 | 1.6 |
| 1hyt | 1npc | 1 | 1.2 | 1 | 1.1 | 1 | 1.7 | 1 | 1.0 | 1 | 0.7 |
| 1inc | 1esa | 1 | 2.9 | 3 | 0.8 | - | - | 1 | 1.9 | (10) | 2.1 |
| 1rbp | 1brq | 1 | 0.9 | 1 | 0.9 | 1 | 0.9 | (2) | 1.6 | 1 | 1.0 |
| 1rob | 8rat | 1 | 0.9 | 2 | 1.0 | 1 | 0.3 | 1 | 1.7 | 1 | 1.6 |
| 1stp | 1swb | 1 | 0.6 | 1 | 0.3 | 1 | 0.8 | 1 | 2.4 | 1 | 1.4 |
| 1ulb | 1ula | - | - | (20) | 3.2 | - | - | 1 | 3.6 | 1 | 3.3 |
| 2ifb | 1ifb | 1 | 2.2 | 1 | 2.2 | 1 | 2.5 | 1 | 2.3 | 1 | 2.1 |
| 3ptb | 3ptn | 1/2 | 1.1 | 2 | 1.0 | (2) | 0.5 | (2) | 1.7 | 1 | 0.9 |
| 2ypi | 1ypi | - | 3.0 | 2 | 3.0 | (3) | 2.2 | - | - | (2) | 2.7 |
| 4dfr | 5dfr | 1 | 1.9 | 1 | 3.5 | 1 | 2.3 | - | - | 1 | 4.5 |
| 4phv | 3phv | 1 | 2.7 | 1 | 2.6 | - | - | 1 | 2.9 | 1 | 2.6 |
| 5cna | 2ctv | 1/11 | 1.0 | (13) | 1.0 | (2) | 0.8 | (6) | 1.1 | (6) | 1.0 |
| 7cpa | 5cpa | 1 | 1.0 | 1 | 1.1 | 1 | 1.3 | 1 | 1.6 | (3) | 1.0 |
| 1a6w | 1a6u | 1/3 | 0.5 | (4) | 1.4 | - | - | - | - | 1 | 1.4 |
| 1acj | 1qif | - | 3.5 | - | 3.6 | 1 | 1.9 | - | - | (40) | 3.9 |
| 1apu | 3app | - | 1.2 | - | 1.9 | - | - | 1 | 3.7 | -(2) | -(4.1) |
| 1blh | 1djb | 1 | 0.7 | 2 | 1.2 | 1 | 2.4 | (2) | 3.9 | (5) | 0.8 |
| 1byb | 1bya | 1 | 2.5 | 1 | 2.8 | (4) | 1.1 | -1 | -(4.2) | 1 | 2.4 |
| 1hfc | 1cge | 1 | 0.7 | 1 | 0.9 | (3) | 0.8 | (3) | 1.2 | 1 | 0.5 |
| 1ida | 1hsi | 1 | 3.4 | 1 | 2.9 | (3) | 1.0 | 1 | 1.0 | 1 | 1.6 |
| 1igj | 1a4j | /4 | 0.8 | -(19) | 2.9 | - | - | - | - | - | - |
| 1imb | 1ime | 1 | 1.7 | 1 | 1.0 | 1 | 1.7 | 1 | 4.0 | 1 | 1.3 |
| 1ivd | 1nna | 1 | 1.4 | 1 | 1.1 | 1 | 3.5 | (2) | 0.9 | 1 | 1.9 |
| 1mrg | 1ahc | 1 | 1.9 | 1 | 1.9 | - | - | 1 | 3.3 | 1 | 0.8 |
| 1mtw | 2tga | 1/5 | 2.8 | -(7) | 1.2 | - | - | (7) | 3.2 | (8) | 1.6 |
| 1okm | 4ca2 | 1 | 2.2 | 1 | 1.6 | - | - | (3) | 2.2 | 1 | 2.1 |
| 1pdz | 1pdy | 1 | 2.6 | 1 | 3.1 | 1 | 1.7 | - | - | (5) | 1.0 |
| 1phd | 1phc | 1 | 0.7 | 1 | 1.2 | 1 | 1.8 | (2) | 1.4 | 1 | 1.3 |
| 1pso | 1psn | 1 | 0.8 | 1 | 1.6 | 1 | 1.6 | -1 | -(4.3) | 1 | 2.1 |
| 1qpe | 3lck | 2 | 1.5 | 2 | 1.2 | 1 | 0.7 | - | - | - | - |
| 1rne | 1bbs | 1 | 1.0 | 1 | 1.2 | 1 | 1.4 | 1 | 2.2 | 1 | 1.0 |
| 1snc | 1stn | 1 | 1.5 | 1 | 1.5 | 1 | 1.3 | 1 | 1.9 | 1 | 1.3 |
| 1srf | 1pts | 1 | 1.5 | 1 | 0.5 | 1 | 1.2 | (5) | 0.8 | 1 | 1.1 |
| 2ctc | 2ctb | 1 | 0.6 | 1 | 1.1 | (2) | 0.8 | 1 | 2.2 | 1 | 1.2 |
| 2h4n | 2cba | 1/2 | 1.0 | 2 | 1.0 | - | - | (3) | 1.2 | (2) | 1.2 |
| 2pk4 | 1krn | 1/2 | 0.7 | 2 | 0.8 | - | - | (2) | 2.2 | 1 | 1.9 |
| 2sim | 2sil | 1/2 | 0.7 | 2 | 0.6 | - | - | (2) | 2.3 | (2) | 0.8 |
| 2tmn | 1l3f | - | 2.1 | - | - | - | - | 1 | 0.7 | 1 | 3.9 |
| 3gch | 1chg | 10 | 2.2 | -(10) | 2.2 | 1 | 0.9 | (11) | 1.5 | (2) | 2.5 |
| 3mth | 6ins | 9 | 3.8 | -(9) | 1.8 | - | - | -(3) | -(4.7) | - | - |
| 5p2p | 3p2p | 1 | 1.3 | 1 | 1.6 | 1 | 1.8 | (2) | 1.6 | (2) | 1.5 |
| 6rsa | 7rat | 1/4 | 0.9 | -(5) | 1.1 | 1 | 1.1 | 1 | 0.6 | 1 | 0.9 |
¹ Grid resolution: 1.0 Å; probe radius: 1.6 Å.
² Parameters are the same as for LIGSITEcs.
³ The values are taken directly from PASS [9]. Only the best hit is shown.
⁴ Grid separation: 1.0 Å. Minimum and maximum radius for gap spheres: 1.0 and 4.0 Å. The "gaps.pdb" file is used to represent the pocket sites.
⁵ Hits: pocket site(s) (PS) lying within 4 Å of the superimposed ligand. Only the best hit is shown. A dash indicates that no hit was found; parentheses indicate hits that are not top hits.
⁶ Distance from the hit to the nearest atom of the superimposed ligand, in Å.
⁷ PS(s) lying within 4 Å of the superimposed ligand.
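The hit criterion in the footnotes can be sketched as follows. This is a minimal illustration, assuming each predicted pocket site is represented by a single point (for example the mass center of its grid points), that "within 4 Å" means a distance of at most 4.0 Å to the nearest ligand atom, and that the "best hit" is the best-ranked site meeting the cutoff. The function names and the toy coordinates are hypothetical.

```python
import math


def nearest_ligand_distance(site, ligand_atoms):
    """Distance (Å) from a pocket-site point to the nearest ligand atom."""
    return min(math.dist(site, atom) for atom in ligand_atoms)


def best_hit(pocket_sites, ligand_atoms, cutoff=4.0):
    """Return (rank, distance) of the best-ranked site within `cutoff`, or None.

    `pocket_sites` must be ordered by rank, largest pocket first.
    """
    for rank, site in enumerate(pocket_sites, start=1):
        distance = nearest_ligand_distance(site, ligand_atoms)
        if distance <= cutoff:
            return rank, distance
    return None  # reported as a dash in the table


# Toy coordinates in Å (made up for illustration).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
sites = [(10.0, 0.0, 0.0), (4.5, 0.0, 0.0), (0.0, 1.0, 0.0)]

hit = best_hit(sites, ligand)
miss = best_hit([(5.1, 0.0, 0.0)], [(0.0, 0.0, 0.0)])
print(hit, miss)
```

Under these assumptions the largest pocket misses the ligand, the second-ranked pocket is the reported hit, and a site 5.1 Å from the only ligand atom is not counted as a hit.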
Overview of the data set of 48 bound/unbound structures.
| Complex | Unbound | RMSD (Å)¹ | Protein Description | Ligand Description² |
| 1bid | 3tms | 0.24 | Thymidylate synthase | CBX, UMP |
| 1cdo | 8adh | 1.17 | Alcohol dehydrogenase | NAD |
| 1dwd | 1hxf | 0.44 | Alpha thrombin + hirudin | MID |
| 1fbp | 2fbp | 0.89 | Phosphohydrolase | AMP, F6P |
| 1gca | 1gcg | 0.32 | Galactose-binding protein | GAL |
| 1hew | 1hel | 0.21 | Acetylchitotriose | NAG |
| 1hyt | 1npc | 0.87 | Thermolysin | DMS, BZS |
| 1inc | 1esa | 0.21 | Elastase | ICL |
| 1rbp | 1brq | 0.54 | Retinol binding protein | RTL |
| 1rob | 8rat | 0.28 | Ribonuclease A | C2P |
| 1stp | 1swb | 0.33 | Streptavidin | BTN |
| 1ulb | 1ula | 0.61 | Purine nucleoside phosphorylase | GUN |
| 2ifb | 1ifb | 0.37 | Fatty acid binding protein | PLM |
| 3ptb | 3ptn | 0.26 | Beta trypsin | BEN |
| 2ypi | 1ypi | 0.57 | Triose phosphate isomerase | PGA |
| 4dfr | 5dfr | 0.80 | Dihydrofolate reductase | MTX |
| 4phv | 3phv | 1.28 | HIV 1 protease | VAC |
| 5cna | 2ctv | 0.44 | Concanavalin A | MMA |
| 7cpa | 5cpa | 2.17 | Carboxypeptidase | FVF |
| 1a6w | 1a6u | 0.35 | B1-8 FV fragment | NIP |
| 1apu | 3app | 0.36 | Penicillopepsin | MAN, OET, IVA, STA |
| 1acj | 1qif | 0.34 | Acetylcholinesterase | THA |
| 1blh | 1djb | 0.23 | Methyl]phosphonate | FOS |
| 1byb | 1bya | 0.26 | Beta amylase | GLC |
| 1hfc | 1cge | 0.37 | Fibroblast collagenase | HAP |
| 1ida | 1hsi | 1.41 | HIV 2 protease | QND, HPB, PY2, PPL |
| 1ivd | 1nna | 1.00 | Sialidase | FUC, ST1, NAG, MAN |
| 1mrg | 1ahc | 0.30 | Alpha momorcharin | AND |
| 1mtw | 2tga | 0.31 | Trypsin | DX9 |
| 1okm | 4ca2 | 0.34 | Carbonic anhydrase II | SAB |
| 1pdz | 1pdy | 0.54 | Enolase | PGA |
| 1phd | 1phc | 0.17 | Camphor 5-monooxygenase | HEM, PIM |
| 1pso | 1psn | 0.33 | Pepsin 3a | IVA, STA |
| 1qpe | 3lck | 0.25 | Lck kinase | PP2, PTR |
| 1rne | 1bbs | 0.60 | Renin | NAG, C60 |
| 1snc | 1stn | 0.52 | Staphylococcal nuclease | PTP |
| 1srf | 1pts | 0.45 | Streptavidin | MTB |
| 1stp | 2rta | 0.62 | Streptavidin | BTN |
| 2ctc | 2ctb | 0.15 | Carboxypeptidase | LOF |
| 2h4n | 2cba | 0.33 | Carbonic anhydrase II | AZM |
| 2pk4 | 1krn | 0.63 | Plasminogen kringle | ACA |
| 2sim | 2sil | 0.25 | Sialidase (neuraminidase) | DAN |
| 2tmn | 1l3f | 0.62 | Thermolysin | PHO, NH2 |
| 3gch | 1chg | 0.91 | Gamma chymotrypsin | CIN |
| 3mth | 6ins | 1.00 | Methylparaben insulin | MPB |
| 5p2p | 3p2p | 0.62 | Phospholipase | DHG |
| 1imb | 1ime | 1.45 | Inositol monophosphatase | LIP |
| 6rsa | 7rat | 2.08 | Ribonuclease | UVC |
¹ RMSD: root mean square deviation of Cα atoms after superimposing the unbound structure on the bound structure.
² Three-letter PDB ligand codes, separated by commas if there is more than one.
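The RMSD in the table is the usual root mean square deviation over paired Cα positions. The sketch below computes it for two coordinate lists that are assumed to be already superimposed and matched atom by atom; it performs no fitting step, and the function name and toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between two equally long lists of 3D points."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy Cα positions in Å: every atom of the second structure is shifted by 1 Å along z.
bound = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
unbound = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

value = rmsd(bound, unbound)
print(f"{value:.2f}")
```

A uniform 1 Å displacement gives an RMSD of 1 Å; in practice the structures would first be superimposed so that only the residual differences remain.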
Figure 2. Left: Hen egg-white lysozyme with its ligand tri-N-acetylchitotriose (PDB 1hel). The ligand binds in a deep pocket, and all algorithms correctly predict the binding site. Red: LIGSITEcs, blue: LIGSITE, cyan: PASS, yellow: SURFNET, orange: CAST. Right: Hexameric insulin with its ligand methylparaben (PDB 6ins). The binding site of the ligand is unusually flat, and therefore none of the methods detects it correctly.
Success rates for 210 bound structures.
| Method | Top1 | Top3 |
| LIGSITEcsc | 75% | |
| LIGSITEcs | 67% | 87% |
| LIGSITE | 65% | 85% |
| PASS | 54% | 79% |
| SURFNET | 42% | 56% |
Number of proteins in each class for the 210 bound structures.
| Class | No. of proteins (as %) | Avg no. pocket points | Stdev |
| Class 1: Binding site in largest pocket | 141/210 = 67% | 209 | 185 |
| Class 2: Binding site in second largest pocket | 28/210 = 13% | 66 | 64 |
| Class 3: Binding site in third largest pocket | 14/210 = 7% | 40 | 41 |
| Class 4: Binding site in none of the above | 27/210 = 13% | | |
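The class counts above translate directly into Top1 and Top3 success rates: Top1 is the share of proteins in Class 1, and Top3 is the combined share of Classes 1 to 3. The short sketch below reproduces that arithmetic; the variable names are illustrative.

```python
# Number of proteins per class, taken from the table above.
class_counts = {1: 141, 2: 28, 3: 14, 4: 27}

total = sum(class_counts.values())
top1 = class_counts[1] / total
top3 = (class_counts[1] + class_counts[2] + class_counts[3]) / total

print(f"total={total}, Top1={top1:.0%}, Top3={top3:.0%}")
```

The result, 67% and 87%, matches the 67% / 87% row of the success-rate table for the 210 bound structures.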
Success rates for 48 unbound/bound structures (percentage).
| Method | Top 1 (unbound) | Top 1 (bound) | Top 3 (unbound) | Top 3 (bound) |
| LIGSITEcsc | 71 | 79 | | |
| LIGSITEcs | 60 | 69 | 77 | 87 |
| LIGSITE | 58 | 69 | 75 | 87 |
| CAST | 58 | 67 | 75 | 83 |
| PASS | 60 | 63 | 71 | 81 |
| SURFNET | 52 | 54 | 75 | 78 |
Figure 3. Mapping pockets and degree of conservation onto a protein surface (1krn). The two largest pockets have similar size (ratio: 1.3). The residues near the second largest pocket (right, yellow), which is the ligand binding site, are more conserved than those near the largest pocket (left, yellow). Red: highly conserved; grey: less conserved.
Figure 4. Success rates of LIGSITEcs for different thresholds on the minimal number of surface-solvent-surface events, MINSSS, for top 3 predictions on the 210 bound structures.
Figure 5. Limits of LIGSITEcs: the hole in a ring structure (PDB 1a4j) is predicted by LIGSITEcs as the largest pocket. The ligand, however, binds to the second largest pocket, shown on the left.
Figure 6. Occupancy of ligands on predicted pocket sites. Grey: all pocket sites; red: mass center of the pocket sites; magenta: ligand. a. Carbonic anhydrase II (2cba), a perfect prediction. b. Acetylchitotriose (1hel), a good prediction, but only a small part of the ligand atoms occupy the pocket sites. c. Purine nucleoside phosphorylase (1ula): the pocket sites cover all atoms of the ligand, but because the ligand is very small, the minimal distance is 5.10 Å and the prediction is not counted as a hit.