Sandeep Chakraborty, Basuthkar J Rao.
Abstract
Promiscuity, the basis for the evolution of new functions through 'tinkering' of residues in the vicinity of the catalytic site, is yet to be quantitatively defined. We present a computational method, Promiscuity Indices Estimator (PROMISE), based on signatures derived from the spatial and electrostatic properties of the catalytic residues, to estimate the promiscuity (PromIndex) of proteins with known active site residues and 3D structure. PromIndex reflects the number of different active site signatures that have congruent matches in close proximity of its native catalytic site, the quality of the matches and the difference in enzymatic activity. Promiscuity in proteins is observed to follow a lognormal distribution (μ = 0.28, σ = 1.1, reduced chi-square = 3.0E-5). The promiscuous functions predicted by PROMISE for any protein can serve as the starting point for directed evolution experiments. PROMISE ranks carboxypeptidase A and ribonuclease A amongst the more promiscuous proteins. We have also investigated the properties of the residues in the vicinity of the catalytic site that regulate its promiscuity. Linear regression establishes a weak correlation (R² ∼ 0.1) between certain properties of the residues (charge, polarity, etc.) in the neighborhood of the catalytic residues and PromIndex. A stronger relationship is that most proteins with high promiscuity have high percentages of charged and polar residues within a radius of 3 Å of the catalytic site, which is validated using one-tailed hypothesis tests (P-values ∼ 0.05). Since these characteristics are known to be key factors in catalysis, their relationship with the promiscuity index cross-validates the methodology of PROMISE.
Year: 2012 PMID: 22359655 PMCID: PMC3281107 DOI: 10.1371/journal.pone.0032011
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Promiscuity of proteins.
(a) Mean, standard deviation and median of the promiscuity index (PromIndex) computed on sample sets composed of 20 proteins from each EC class, with at least 30 values of PromIndex for each protein; the active site comprises residues within a radius of 5 Å from the catalytic residues. (b) The mean from sample sets balanced with respect to EC number, compared to the PromIndex computed from the full set of proteins (305 proteins). (c) PromIndex computed with radii of 1–6 Å. The proteins are sorted according to the PromIndex computed with a 5 Å radius. (d) PromIndex fitted to a lognormal distribution (μ = 0.28, σ = 1.1, reduced chi-square = 3.0E-5). (e) Cumulative distribution curve.
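The lognormal fit quoted in panel (d) can be explored with a minimal sketch. It assumes that μ and σ are the mean and standard deviation of the logarithm of the fitted variable (the usual parametrization); the record does not state the parametrization or the scale on which PromIndex was fitted, so the numbers below are illustrative only. The function names are hypothetical.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of a variable whose natural log is Normal(mu, sigma^2)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution of the same lognormal variable."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Parameters quoted in the caption; their parametrization is an assumption.
MU, SIGMA = 0.28, 1.1

# Under this parametrization the median of the distribution is exp(mu).
median = math.exp(MU)

if __name__ == "__main__":
    for x in (0.5, 1.0, median, 5.0):
        print(f"x={x:.3f}  pdf={lognormal_pdf(x, MU, SIGMA):.4f}  "
              f"cdf={lognormal_cdf(x, MU, SIGMA):.4f}")
```

Evaluating `lognormal_cdf` over a grid of values gives a curve of the kind shown in panel (e), under the stated assumption.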
Most promiscuous proteins as computed by PROMISE: (protein with PDB id: 5CPA is the most promiscuous).
| PDB | P | M | EC | L | NATIVE FUNCTION | POSSIBLE PROMISCUOUS FUNCTIONS |
| 5CPA | 1 | Y | 3.4.17.1 | 307 | CARBOXYPEPTIDASE A | cytochrome c peroxidase, pyruvate oxidase, thymidine phosphorylase, hexokinase pii, chloramphenicol acetyltransferase |
| 1MPY | 0.9 | Y | 1.13.11.2 | 307 | CATECHOL 2,3-DIOXYGENASE | dipeptidyl peptidase iv soluble form, purple acid phosphatase, bacterial luciferase, phosphoinositide-specific phospholipase c, |
| 1AQ2 | 0.85 | N | 4.1.1.49 | 540 | PHOSPHOENOLPYRUVATE CARBOXYKINASE | glutamate semialdehyde aminotransferase, endo/exocellulase e4, deoxycytidylate hydroxymethylase, aldolase carbamate carbamoylphosphate synthetase |
| 1QZ9 | 0.84 | Y | 3.7.1.3 | 416 | KYNURENINASE | glycine n-methyltransferase, leucyl-trna synthetase, cysteinyl-trna synthetase, citrate synthase, cytochrome p450 2b4 |
| 1EHI | 0.84 | Y | 6.3.2.4 | 377 | D-ALANINE:D-LACTATE LIGASE | ribonuclease alpha-sarcin, cytochrome p450 2b4, glutamate semialdehyde aminotransferase, ermc' methyltransferase, chitinase |
| 5RSA | 0.83 | N | 3.1.27.5 | 124 | RIBONUCLEASE A | thymidine phosphorylase, putative biotin ligase, nadh-dependent nitrate reductase, adenylosuccinate synthetase, carboxykinase |
| 1I9A | 0.77 | Y | 5.3.3.2 | 182 | ISOPENTENYL-DIPHOSPHATE DELTA-ISOMERASE | acid beta-glucosidase, ribonuclease rh, chitinase a, purine nucleoside phosphorylase, leucyl-trna synthetase |
| 1M9C | 0.76 | N | 5.2.1.8 | 165 | CYCLOPHILIN A | nadh-dependent nitrate reductase, uracil phosphoribosyltransferase, cyclooxygenase-2, cytochrome c peroxidase, dehydrogenase |
| 1ONE | 0.74 | Y | 4.2.1.11 | 436 | ENOLASE | purine nucleoside phosphorylase, n-ethylmaleimide sensitive factor, oxygen-insensitive nadph nitroreductase, ribonuclease alpha-sarcin, biotin ligase |
| 1GUM | 0.63 | N | 2.5.1.18 | 222 | GLUTATHIONE TRANSFERASE | uracil phosphoribosyltransferase, phosphomannose isomerase, carboxykinase, tetrahydrodipicolinate n-succinyltransferase, biotin ligase |
P: Promiscuity index; M: whether a metal is liganded by the active site in the crystal structure (Y/N); EC: Enzyme Commission number; L: Sequence length.
Least promiscuous proteins as computed by PROMISE: (protein with PDB id: 3R1R is the least promiscuous).
| PDB | P | M | EC | L | NATIVE FUNCTION | POSSIBLE PROMISCUOUS FUNCTIONS |
| 3R1R | 0.01 | N | 1.17.4.1 | 761 | RIBONUCLEOTIDE REDUCTASE R1 | glutamine phosphoribosylpyrophosphate amidotr, ferredoxin-nadp+ reductase, ribonucleoside triphosphate reductase, |
| 1THG | 0.01 | N | 3.1.1.3 | 544 | LIPASE | deoxyhypusine synthase, ribonuclease t1, beta-glucuronidase, 2-enoyl-coa hydratase, ribonuclease alpha-sarcin |
| 1B6G | 0.01 | N | 3.8.1.5 | 310 | HALOALKANE DEHALOGENASE | purine nucleoside phosphorylase, thymidylate synthase, |
| 2ADM | 0.02 | N | 2.1.1.72 | 421 | METHYLTRANSFERASE | cytochrome p450 2b4, glutamine phosphoribosylpyrophosphate amidotr, |
| 1POW | 0.02 | Y | 1.2.3.3 | 585 | PYRUVATE OXIDASE | fructose-1,6-bisphosphatase, quinone reductase, lysozyme, adenylate kinase |
| 1MEK | 0.02 | N | 5.3.4.1 | 120 | DISULFIDE ISOMERASE | (glutathione transferase a4-4), caspase-8, thymidylate synthase, ctp synthetase, glutamine phosphoribosylpyrophosphate amidotr |
| 1T7D | 0.02 | N | 3.4.21.89 | 250 | SIGNAL PEPTIDASE I | udp-n-acetylmuramoyl-l-alanine/:d-glutamate l, d-dopachrome tautomerase, acetylglutamate kinase, |
| 2CPU | 0.03 | N | 3.2.1.1 | 496 | ALPHA-AMYLASE | catechol 2,3-dioxygenase, aspartyl-trna synthetase, dihydropteroate synthase, phospholipase a2 |
P: Promiscuity index; M: whether a metal is liganded by the active site in the crystal structure (Y/N); EC: Enzyme Commission number; L: Sequence length.
Predicted residues, pairwise distances and potential differences in carboxypeptidase A and cytochrome c peroxidases using the motif (Ala48,His52,Trp191) from a cytochrome c peroxidase (PDB id: 1DJ1).
| PDB | Residue a | Residue b | Residue c | Distance ab (Å) | Distance ac (Å) | Distance bc (Å) | Potential difference ab | Potential difference ac | Potential difference bc |
| 1DJ1 | Ala48 | His52 | Trp191 | 6.3 | 11.4 | 13.8 | 329.1 | 317.2 | −11.9 |
| 5CPA | Ala143 | His69 | Trp63 | 7.5 | 11.5 | 13.3 | 321.1 | 255.3 | −65.7 |
Potential differences are in units of kT/e (k is Boltzmann's constant, T is the temperature in K and e is the charge of an electron).
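To make the comparison in the table concrete, the sketch below computes the per-pair deviation between the reference motif (1DJ1) and the predicted residues (5CPA), using only the values listed above. It is an illustration of how the two signatures differ, not the PROMISE scoring function, which this record does not describe; the names `pairwise_deviation`, `reference` and `candidate` are hypothetical.

```python
# Residue pairs of the three-residue motif (a, b, c).
PAIRS = ("ab", "ac", "bc")

# Values copied from the table: distances in Å, potential differences in kT/e.
reference = {  # cytochrome c peroxidase, PDB id 1DJ1
    "distance": {"ab": 6.3, "ac": 11.4, "bc": 13.8},
    "potential": {"ab": 329.1, "ac": 317.2, "bc": -11.9},
}
candidate = {  # carboxypeptidase A, PDB id 5CPA
    "distance": {"ab": 7.5, "ac": 11.5, "bc": 13.3},
    "potential": {"ab": 321.1, "ac": 255.3, "bc": -65.7},
}

def pairwise_deviation(ref, cand):
    """Absolute difference between two signatures, one value per residue pair."""
    return {pair: abs(ref[pair] - cand[pair]) for pair in PAIRS}

dist_dev = pairwise_deviation(reference["distance"], candidate["distance"])
pot_dev = pairwise_deviation(reference["potential"], candidate["potential"])

if __name__ == "__main__":
    for pair in PAIRS:
        print(f"{pair}: distance deviation {dist_dev[pair]:.1f} Å, "
              f"potential deviation {pot_dev[pair]:.1f} kT/e")
```

In both rows the bc potential difference equals the ac value minus the ab value, to within rounding, as expected if each entry is the difference between the potentials at the two residues.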
Figure 2. Scaffold for directed evolution.
Superimposition of the predicted residues in a carboxypeptidase A (PDB id - 5CPA: in green) on a motif of active site residues from a cytochrome c peroxidase (PDB id - 1DJ1: in blue). The residues are colored in yellow and red for carboxypeptidase A and cytochrome c peroxidase respectively.
Figure 3. Curve fitting using Gnuplot with varying radii and various characteristics of the residues within a shell of that radius.
(a) Percentage of polar residues. (b) Percentage of acidic residues. (c) Percentage of basic residues. (d) Percentage of charged (acidic and basic) residues. (e) Coefficient of determination R² plotted with increasing distance from the active site.
Figure 4. Promiscuity index plotted against the characteristics of residues within a radial distance of 5, 8 and 15 Å from the catalytic residues.
(a) Percentage of polar residues. (b) Percentage of charged residues.
Mean, median and standard deviation (SD) of the percentage of polar and charged residues as the distance from the active site increases, for ∼300 proteins.
| Distance (Å) | % polar: Mean | % polar: Median | % polar: SD | % charged: Mean | % charged: Median | % charged: SD |
| 1 | 79 | 100 | 24 | 65 | 66 | 27 |
| 2 | 79 | 100 | 24 | 64 | 66 | 28 |
| 3 | 73 | 75 | 23 | 55 | 57 | 25 |
| 4 | 57 | 57 | 18 | 37 | 35 | 18 |
| 5 | 50 | 50 | 15 | 30 | 28 | 14 |
| 6 | 46 | 46 | 12 | 26 | 25 | 11 |
| 7 | 44 | 44 | 11 | 23 | 22 | 9 |
| 8 | 42 | 42 | 9 | 21 | 21 | 7 |
| 9 | 41 | 40 | 8 | 21 | 20 | 7 |
| 10 | 40 | 40 | 7 | 20 | 20 | 6 |
| 11 | 39 | 39 | 6 | 19 | 19 | 5 |
| 12 | 38 | 38 | 6 | 19 | 19 | 5 |
| 13 | 38 | 38 | 5 | 19 | 19 | 5 |
| 14 | 38 | 38 | 5 | 19 | 19 | 4 |
| 15 | 38 | 39 | 5 | 20 | 19 | 4 |
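The percentages in the table can be reproduced in spirit with a short sketch that counts polar and charged residues inside a given radius. The residue classification below is an assumption: the record only states that charged means acidic plus basic, so treating histidine as basic and counting charged residues as polar are illustrative choices. How the distance to the catalytic residues is measured is also assumed to be precomputed, and the example residues are invented.

```python
# Assumed residue classes (three-letter codes); not taken from the source.
ACIDIC = {"ASP", "GLU"}
BASIC = {"ARG", "LYS", "HIS"}
CHARGED = ACIDIC | BASIC
POLAR = CHARGED | {"SER", "THR", "ASN", "GLN", "TYR", "CYS"}

def shell_composition(residues, radius):
    """Percentage of polar and charged residues within `radius` Å.

    `residues` is an iterable of (name, distance) pairs, where distance is
    the precomputed distance in Å to the nearest catalytic residue.
    Returns None when no residue lies inside the radius.
    """
    inside = [name for name, dist in residues if dist <= radius]
    if not inside:
        return None
    total = len(inside)
    return {
        "polar": 100.0 * sum(name in POLAR for name in inside) / total,
        "charged": 100.0 * sum(name in CHARGED for name in inside) / total,
    }

# Invented example: catalytic residues sit at distance 0.
example = [
    ("HIS", 0.0), ("GLU", 0.0), ("ARG", 2.8), ("SER", 3.5),
    ("LEU", 4.2), ("ALA", 4.9), ("VAL", 7.0), ("ASP", 7.5),
]

if __name__ == "__main__":
    for radius in (3.0, 5.0, 8.0):
        print(radius, shell_composition(example, radius))
```

With small radii the shell contains little beyond the catalytic residues themselves, which is consistent with the high percentages at 1–2 Å in the table.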
Figure 5. One-tailed hypothesis test for proportion. The alternative hypothesis is that more than 80% of proteins with PromIndex > 0.3 have more than 45% polar residues or more than 25% charged residues in the vicinity of the active site.
(a) Sample proportion – i.e. % of proteins with PromIndex >0.3 that have more than 45% of polar residues or 25% of charged residues in the vicinity of the active site. (b) σ value (for P = 0.8) (c) z-scores (d) P-values.
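The quantities in panels (a) to (d) correspond to a standard one-tailed z-test for a proportion, sketched below. The sketch assumes the usual normal approximation, with σ computed under the null proportion P = 0.8; the record does not give sample counts, so the counts in the example are invented and the function name is hypothetical.

```python
import math

def one_tailed_proportion_test(successes, n, p0=0.8):
    """Upper-tailed z-test of H0: p = p0 against H1: p > p0.

    Returns (sample proportion, sigma under H0, z-score, P-value).
    """
    p_hat = successes / n
    # Standard error of the sample proportion under the null hypothesis.
    sigma = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / sigma
    # Upper-tail probability of a standard normal: 1 - Phi(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return p_hat, sigma, z, p_value

if __name__ == "__main__":
    # Invented counts: 87 of 100 proteins satisfy the criterion.
    p_hat, sigma, z, p_value = one_tailed_proportion_test(87, 100)
    print(f"p_hat={p_hat:.2f} sigma={sigma:.3f} z={z:.2f} P={p_value:.4f}")
```

A P-value below the chosen significance level (for example 0.05) would lead to rejecting the null proportion of 0.8 in favor of a larger one.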
Figure 6. One-tailed hypothesis test for proportion. The alternative hypothesis is that 80% of proteins with less than 45% polar residues or less than 25% charged residues in the vicinity of the active site have PromIndex < 0.3.
(a) Sample proportion – i.e. % of proteins that have less than 45% of polar residues or 25% of charged residues in the vicinity of the active site, and have PromIndex <0.3. (b) σ value (for P = 0.8) (c) z-scores (d) P-values.