Shu-Hsien Sheu, David R Lancia, Karl H Clodfelter, Melissa R Landon, Sandor Vajda.
Abstract
PRECISE (Predicted and Consensus Interaction Sites in Enzymes) is a database of interactions between the amino acid residues of an enzyme and its ligands (substrate and transition state analogs, cofactors, inhibitors and products). It is available online at http://precise.bu.edu/. In the current version, all information on interactions is extracted from the enzyme-ligand complexes in the Protein Data Bank (PDB) by performing the following steps: (i) clustering homologous enzyme chains such that, in each cluster, the proteins have the same EC number and all sequences are similar; (ii) selecting a representative chain for each cluster; (iii) selecting ligand types; (iv) finding non-bonded interactions and hydrogen bonds; and (v) summing the interactions for all chains within the cluster. The output of the search is the color-coded sequence of the representative. The colors indicate the total number of interactions found at each amino acid position in all chains of the cluster. Clicking on a residue displays a detailed list of interactions for that residue. Optional filters allow restricting the output to selected chains in the cluster, to non-bonded or hydrogen bonding interactions, and to selected ligand types. The binding site information is essential for understanding and altering substrate specificity and for the design of enzyme inhibitors.
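Step (v) and the optional filters can be illustrated with a minimal sketch. It is not the PRECISE implementation: the `Interaction` record, its field names, the `sum_interactions` helper and the toy entries (including the placeholder codes `xxx1` and `xxx2`) are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Interaction(NamedTuple):
    """Hypothetical record for one residue-ligand interaction."""
    pdb_id: str        # code of the complex (placeholder values below)
    chain: str         # chain identifier
    aligned_pos: int   # residue position in the cluster alignment
    kind: str          # "non-bonded" or "hydrogen bond"
    ligand_type: str   # e.g. "peptide", "cofactor", "metal ion"


def sum_interactions(interactions, chains=None, kinds=None, ligand_types=None):
    """Count interactions per aligned position, applying optional filters.

    A filter set to None is not applied.
    """
    counts = Counter()
    for it in interactions:
        if chains is not None and (it.pdb_id, it.chain) not in chains:
            continue
        if kinds is not None and it.kind not in kinds:
            continue
        if ligand_types is not None and it.ligand_type not in ligand_types:
            continue
        counts[it.aligned_pos] += 1
    return counts


# Toy data invented for illustration; not taken from the database.
toy = [
    Interaction("xxx1", "A", 10, "hydrogen bond", "peptide"),
    Interaction("xxx1", "A", 10, "non-bonded", "peptide"),
    Interaction("xxx2", "B", 10, "non-bonded", "metal ion"),
    Interaction("xxx2", "B", 42, "hydrogen bond", "cofactor"),
]

totals = sum_interactions(toy)
hbonds_only = sum_interactions(toy, kinds={"hydrogen bond"})
one_chain = sum_interactions(toy, chains={("xxx1", "A")})
```

Keying the counts on the aligned position rather than the original residue number is what lets interactions from different chains of a cluster accumulate at the same site; the per-position totals would then drive the color coding.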
Year: 2005 PMID: 15608178 PMCID: PMC540045 DOI: 10.1093/nar/gki091
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Statistics for enzyme structures in the PDB. (a) Distribution of the 23 872 enzyme chains in the PDB among the 1176 different EC numbers. The graph is truncated at 70 enzyme chains per EC number; there are isolated instances of higher values up to 933 enzyme chains per EC number. (b) Distribution of the 2280 sequence-similar enzyme clusters among the 1176 EC numbers. The graph is truncated at 10 clusters per EC number; there are isolated instances of higher values up to 44 clusters per EC number. (c) Distribution of the 23 872 enzyme chains in the PDB among the 2280 sequence-similar clusters. The graph is truncated at 40 enzyme chains per cluster; there are isolated instances of higher values up to 475 enzyme chains per cluster. (d) Distribution of clusters with given levels of amino acid sequence identity.
Figure 2. Distribution of the 23 872 enzyme chains among categories defined by the percentage of residues that have a given number of interactions. The two variables of this two-dimensional distribution are the percentage of residues and the number of interactions per residue. Its peak shows that, in most enzymes, 1–2% of the residues have 20–30 interactions.
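The per-chain quantity behind such a distribution can be sketched as follows. The function name, the bin width of 10 and the toy chain are assumptions for illustration; the binning actually used for the figure is not stated here.

```python
from collections import Counter


def residue_percentages(counts_per_residue, bin_width=10):
    """Return {bin lower bound: percentage of residues in that bin}.

    counts_per_residue holds one interaction count per residue of a chain.
    bin_width is an assumed value, not one taken from the source.
    """
    n = len(counts_per_residue)
    bins = Counter((c // bin_width) * bin_width for c in counts_per_residue)
    return {lo: 100.0 * k / n for lo, k in sorted(bins.items())}


# Toy chain of 100 residues: most have no interactions, two have 20-29.
toy_chain = [0] * 96 + [25, 27, 5, 12]
profile = residue_percentages(toy_chain)
```

Collecting one such profile per chain and histogramming the (percentage, interaction bin) pairs over all chains would give a two-dimensional distribution of the kind the caption describes.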
Figure 3. Output of PRECISE for the PDB id 2TLX (thermolysin). (a) Main output page. Chain E of 8TLN is the representative of the cluster containing 2TLX. The color-coded sequence indicates the residues that belong to the binding site, as well as their frequencies of occurrence in enzyme–ligand complexes. The panel on the right shows all the PDB codes and chain identifiers of the entries that form the cluster. The user may select any subset of these entries and recalculate the list of interactions. The two additional panels permit the users to restrict the set of interactions to selected interaction types (i.e. non-bonded or hydrogen bond) and to selected ligand types (i.e. peptides, nucleotides, cofactors, metal ions, other inorganic ions or ‘others’). Any subset of these can be selected to produce the list of interactions. Clicking on any ‘colored’ residue displays the panel shown in (b). (b) Part of the detailed list of interactions for a residue in 2TLX. For each interaction, the list shows the PDB code and chain identifier of the protein; the name, heteroatom code, and type of the ligand; the interacting residue and atom in the protein, and the type of the interaction (non-bonded or hydrogen bond). The list shows both the ‘interaction position’, i.e. the original sequence number of the interacting residue in the PDB file, and the ‘aligned position’, which is the sequence number of the same residue in the alignment of sequences for the entire cluster.
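The relation between the ‘interaction position’ and the ‘aligned position’ can be sketched as a simple lookup built from one gapped row of the cluster alignment. This is an illustrative sketch only: the function name and the toy sequence are hypothetical, and it assumes that residues are numbered contiguously from `first_resnum`, which real PDB files do not guarantee (numbering gaps and insertion codes occur).

```python
def aligned_positions(aligned_seq, first_resnum=1, gap="-"):
    """Map original residue numbers to 1-based alignment columns.

    Assumes contiguous residue numbering starting at first_resnum.
    """
    mapping = {}
    resnum = first_resnum
    for col, ch in enumerate(aligned_seq, start=1):
        if ch != gap:
            mapping[resnum] = col  # this residue occupies alignment column col
            resnum += 1
    return mapping


# Toy alignment row with two gap columns; residue numbering starts at 5.
m = aligned_positions("MK--TAY", first_resnum=5)
```

In this toy case residue 7 of the chain maps to aligned position 5, because the two gap columns shift every later residue by two.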