Patrick Slama, Ioannis Filippis, Michael Lappe.
Abstract
BACKGROUND: Identifying the active site of an enzyme is a crucial step in functional studies. While protein sequences and structures can be experimentally characterized, determining which residues make up an active site is not a straightforward process. In the present study a new method for the detection of protein active sites is introduced. This method uses local network descriptors derived from protein three-dimensional structures to determine whether a residue is part of an active site. It thus does not involve any sequence alignment or structural similarity to other proteins. A scoring function is developed over a set of more than 220 proteins having different structures and functions, in order to detect protein catalytic sites with high precision, i.e. with a minimal rate of false positives.
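As a rough illustration of what a residue interaction network and a local descriptor can look like, the sketch below builds a contact graph from residue coordinates and computes each residue's degree. Everything in it is an assumption made for illustration: the function names, the use of one coordinate per residue, the 8 Å cutoff, and the choice of degree as the descriptor. It is not the scoring function of Equation 1, which is not reproduced on this page.

```python
import math

def residue_network(coords, cutoff=8.0):
    """Link two residues when their representative atoms lie within `cutoff` angstroms.

    `coords` is a list of (x, y, z) tuples, one per residue (hypothetical input).
    The 8.0 default is an illustrative choice, not a value taken from the paper.
    """
    neighbours = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours

def degree(neighbours):
    """A simple local descriptor: the number of contacts of each residue."""
    return {i: len(nb) for i, nb in neighbours.items()}

# Toy coordinates (made up), five residues in a plane.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (3.8, 3.8, 0.0),
]
toy_network = residue_network(toy_coords)
toy_degree = degree(toy_network)
print(toy_degree)
```

Because such descriptors are computed from a single structure, no alignment to other sequences or structures is needed, which is the property the abstract emphasizes.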
Year: 2008 PMID: 19055796 PMCID: PMC2632678 DOI: 10.1186/1471-2105-9-517
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of the performance of predictions of catalytic residues using different scoring functions and threshold values on the extended protein set
| Scoring function | F1 | F2 | Precision | Coverage |
| Closeness + RSA a | 15.13% | 11.54% | 8.22% | 31.66% |
| Eq. 1 b, MDev1 | 20.82% | - | 15.42% | 32.05% |
| Eq. 1, MDev2 | - | 20.56% | 28.10% | 9.91% |
Calculations were run over the residue interaction networks derived from the 226 protein structures of our extended test set. The scoring functions used here are described in Methods. Values for precision and coverage were obtained over the whole set. F1 and F2 are the F-measures defined in Methods with β = 1 and β = 2, respectively. a As proposed by Amitai et al. [14]. RSA: residue surface accessibility. b Residues were considered catalytic if their MDev value for the scoring function defined in Equation 1 exceeded the indicated threshold. The corresponding 'specificity' values, (p-, r-)/r-, i.e. the fraction of non-catalytic residues that were also predicted as non-catalytic, were 97.80% at MDev1, 99.68% at MDev2, and 95.57% when using closeness combined with RSA.
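The Methods section that defines the F-measure is not reproduced here, so the following is only a sketch under an assumed form, F = (1 + β²)·p·r / (p + β²·r), in which β > 1 gives more weight to precision p than to coverage r. This form was chosen because it approximately reproduces the two F values tabulated for Equation 1 (20.82% and 20.56%) from the precision and coverage in the same rows. The function name `f_measure` is hypothetical.

```python
def f_measure(precision, coverage, beta=1.0):
    """Assumed F-measure: (1 + beta^2) * p * r / (p + beta^2 * r).

    This form is inferred from the tabulated values, not quoted from Methods.
    With beta = 1 it is the harmonic mean of precision and coverage.
    """
    if precision <= 0.0 or coverage <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * coverage / (precision + b2 * coverage)

# Precision and coverage for Equation 1 at the two thresholds, from the table.
f1_mdev1 = f_measure(0.1542, 0.3205, beta=1.0)
f2_mdev2 = f_measure(0.2810, 0.0991, beta=2.0)

print(f"F1 at MDev1: {100 * f1_mdev1:.2f}%")
print(f"F2 at MDev2: {100 * f2_mdev2:.2f}%")
```

Both results agree with the tabulated values to within the rounding of the inputs, which supports the assumed form but does not confirm it.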
Figure 1. Values of F1 and F2 as a function of the MDev threshold. Scores were calculated for all residues from the extended test set. The values of the effectiveness measures F1 and F2, as defined in Methods (F1, with β = 1, as triangles; F2, with β = 2, as circles), were calculated when classifying as catalytic the residues with an MDev value above thresholds ranging from 0.2 to 1. The thresholds that maximize F1 and F2, MDev1 and MDev2 respectively, are indicated.
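The threshold scan described in this caption can be sketched as follows. The scores and labels are made-up toy data, the function names are hypothetical, and the F-measure uses an assumed precision-weighted form, (1 + β²)·p·r / (p + β²·r), since the Methods definition is not reproduced here.

```python
def precision_coverage(scores, is_catalytic, threshold):
    """Classify residues with score > threshold as catalytic and evaluate."""
    predicted = [s > threshold for s in scores]
    true_pos = sum(1 for p, c in zip(predicted, is_catalytic) if p and c)
    n_predicted = sum(predicted)
    n_catalytic = sum(is_catalytic)
    precision = true_pos / n_predicted if n_predicted else 0.0
    coverage = true_pos / n_catalytic if n_catalytic else 0.0
    return precision, coverage

def f_measure(precision, coverage, beta):
    """Assumed form: (1 + beta^2) * p * r / (p + beta^2 * r)."""
    if precision <= 0.0 or coverage <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * coverage / (precision + b2 * coverage)

def best_threshold(scores, is_catalytic, beta, thresholds):
    """Return the first threshold that maximizes the F-measure."""
    return max(
        thresholds,
        key=lambda t: f_measure(*precision_coverage(scores, is_catalytic, t), beta),
    )

# Toy per-residue scores and catalytic labels (illustrative only).
toy_scores = [1.1, 0.9, 0.7, 0.5, 0.4, 0.3, 0.1, 0.0]
toy_labels = [True, True, False, True, False, False, False, False]

# Thresholds from 0.2 to 1.0 in steps of 0.05; the step size is an assumption.
thresholds = [i / 100 for i in range(20, 101, 5)]

mdev1_toy = best_threshold(toy_scores, toy_labels, 1.0, thresholds)
mdev2_toy = best_threshold(toy_scores, toy_labels, 2.0, thresholds)
print(mdev1_toy, mdev2_toy)
```

On this toy data the β = 2 optimum falls at a higher threshold than the β = 1 optimum, mirroring the trade-off of lower coverage for higher precision seen between MDev1 and MDev2.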
Figure 2. Per-protein ranges obtained on the precision (A and C) and coverage (B and D) of the detection when considering as catalytic the residues with an MDev value above threshold. MDev values were calculated on each residue of the extended test set from scores calculated according to Equation 1.
Results at the residue scale: detection of catalytic and functional residues over the proteins from the validation set.
| Protein | Residues predicted as catalytica | Non-detected catalytic residues | Commentsb |
| TEM β-lactamase | Lys73*, | Ser70, Ser130 | Lys234 forms H-bond with substrate analogue-binding water |
| Pancreatic phospholipase | | Gly30 | Asp49 binds Ca |
| Alkylguanine-transferase | Tyr69, His71, | Asn137, Cys145 | Glu172→His146 activates Cys145 by deprotonation, Lys165 mutations affect activity |
| Ubiquitin-conjugating enzyme 1 | Lys36, | Cys88 | Detected residues define a single site in structure |
| Phenylalanine hydroxylase | His138*, | Ser203 | Asp139 forms H-bond with Fe-bound H2O |
| Prolyl-isomerase 1 | His59*, Glu145, | Cys113 | Glu145 plays a role in the two-domain arrangement of the protein |
| Ferric binding protein | His9*, | Tyr195, Tyr196 | Arg101 (not conserved) interacts with ligand, Glu57 interacts with ligand and binds iron |
| Bovine β-trypsin | | His57, Asp102, Gly193, Gly196, Ser214 | Asp189 forms H-bond with substrate-bound water, Tyr228 is H-bonded to Asp189 through H2O |
a All residues predicted using our scoring parameter at threshold MDev1. In bold, those also predicted at threshold MDev2. With * superscript, residues that are 'catalytic' according to CSA definition. Residues that are 'functional' according to our definition (see text) but not 'catalytic' according to CSA are underlined. b Derived from the analysis of multiple crystallized states. See text for references.
Comparison of performances of detections carried out on the validation set using different threshold values
| Threshold on MDev | F-measure a | Coverage (catalytic) | Precision (catalytic) | Precision (functional) |
| MDev1 | 33.5% | 44.4% | 31.6% | 65.8% |
| MDev2 | 27.8% | 20% | 45.5% | 72.7% |
Calculations were run on the 8 protein structures of the validation set. Our scoring function (Equation 1) was used, with detection at two different thresholds. Residues were considered positives (catalytic or functional) if their MDev value exceeded the threshold value. Performances are expressed with respect to the whole set. a Values correspond to measure F1 for MDev1 and F2 for MDev2.
Figure 3. Catalytic residues detected on ubiquitin-conjugating enzyme 1 using our scoring function. Side-chains of residues predicted as catalytic at thresholds MDev1 and MDev2 are shown as ball-and-sticks, with carbon atoms in orange and green, respectively. The second monomer present in the crystal structure is shown in cyan. The active-site cysteine residue, Cys88, is shown as ball-and-sticks, with carbon atoms in purple.
Figure 4. Visualisation of the catalytic and functional residues detected using our scoring function on two proteins from the validation set. Side-chains of residues predicted as catalytic (see Table 2) when using thresholds MDev1 and MDev2 are shown as sticks, with carbon atoms in green and orange, respectively. Bonds to iron are shown as solid lines. Left: phenylalanine hydroxylase. Iron is shown as a red sphere and water in pink. Right: ferric-binding protein. Iron is shown as a red sphere and phosphorus in yellow.
Figure 5. Distribution of MDev values. Residue numbers correspond to the numbering of the PDB structure and are indicated for residues whose MDev value on our scoring function exceeds MDev1. Catalytic residues are indicated with a * superscript; functional residues are underlined.