Ligation site in proteins recognized in silico.

Michal Brylinski, Leszek Konieczny, Irena Roterman.

Abstract

Recognition of a ligation site in a protein molecule is important for identifying its biological activity. A model for in silico recognition of ligation sites in proteins is presented. The idealized hydrophobic core stabilizing the protein structure is represented by a three-dimensional Gaussian function. Comparing the experimentally observed distribution of hydrophobicity with this theoretical distribution reveals differences, and the area where the differences are high indicates the ligation site. AVAILABILITY: http://bioinformatics.cm-uj.krakow.pl/activesite.

Year: 2006 PMID: 17597871 PMCID: PMC1891674 DOI: 10.6026/97320630001127

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

The classic oil-drop model of the hydrophobic core in proteins, given by Kauzmann [1], was intended to visualize the importance of the hydrophobic interactions responsible for forming and stabilizing the protein tertiary structure. [2,3,4] A hydrophilic surface surrounding a hydrophobic center is generally accepted [5,6] as the model according to which the amino acid sequence partitions a protein into its inside and outside. [7] The model presented here aims to localize the area responsible for ligand binding. It is based on the spatial distribution of hydrophobicity, which changes from the protein interior (maximal hydrophobicity) to the exterior (hydrophobicity close to zero) and can be represented by a three-dimensional Gaussian function. [8,9,10] A simple comparison of the theoretical (Gaussian function) and empirical spatial distributions of hydrophobicity in a protein identifies areas of high discrepancy, which, as observed in crystal structures of protein-ligand complexes, can be recognized as ligation sites.

Methodology

Data

The complexes selected for the analysis presented in this paper are cAMP-dependent protein kinase (PDB ID: 1CDK), cyclin-dependent protein kinase 2 (PDB ID: 1E1V), proto-oncogene tyrosine-protein kinase ABL (PDB ID: 1IEP), and S-lectin (PDB ID: 1SLT).

Grid system

The grid system (with a constant step size) is constructed for the protein molecule placed with its geometrical center at the origin of the coordinate system (0,0,0) and oriented as follows: the longest distance between effective atoms (side chains represented by their geometrical centers) lies along the X-axis, and the longest distance between the projections of effective atoms on the YZ plane lies along the Y-axis. The size of the ellipsoid can be calculated from the maximum and minimum values of the X, Y and Z coordinates found in the molecule, oriented as above.
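The orientation and grid construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `orient` and `build_grid` are hypothetical, the effective atoms are assumed to be given as plain `(x, y, z)` tuples, the grid is assumed to span the bounding box of the oriented molecule, and the molecule is assumed not to be collinear (otherwise the Y-axis is undefined).

```python
import itertools
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def farthest_pair_direction(points):
    # Brute-force search for the most distant pair; O(n^2), fine for a sketch.
    p, q = max(itertools.combinations(points, 2),
               key=lambda pq: dot(sub(pq[0], pq[1]), sub(pq[0], pq[1])))
    return unit(sub(p, q))

def orient(points):
    """Center on the geometrical center, then rotate so that the longest
    inter-point distance lies along X and the longest distance between
    YZ-plane projections lies along Y."""
    n = len(points)
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    centered = [sub(p, center) for p in points]
    ex = farthest_pair_direction(centered)
    # Remove the X component: projection onto the plane perpendicular to X.
    projected = [sub(p, tuple(dot(p, ex) * e for e in ex)) for p in centered]
    ey = farthest_pair_direction(projected)
    ez = cross(ex, ey)
    return [(dot(p, ex), dot(p, ey), dot(p, ez)) for p in centered]

def build_grid(points, step):
    """Regular grid with a constant step covering the bounding box (assumption)."""
    axes = []
    for k in range(3):
        lo = min(p[k] for p in points)
        hi = max(p[k] for p in points)
        count = int(math.floor((hi - lo) / step)) + 1
        axes.append([lo + i * step for i in range(count)])
    return list(itertools.product(*axes))
```

A call such as `build_grid(orient(effective_atoms), step=5.0)` would then yield the grid points; the step value here is only a placeholder, since the text does not state the one used.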

Theoretical hydrophobicity distribution

The theoretical hydrophobicity value for each grid point can be calculated according to a three-dimensional Gaussian function (Equation 1), as given in the PDF file linked below.
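Because the equation itself is not reproduced in this text, the following Python sketch only illustrates the general idea of a three-dimensional Gaussian evaluated on grid points. It assumes that the Gaussian is centered at the origin (where the molecule's geometrical center was placed), that it is a product of three one-dimensional Gaussians with separate standard deviations per axis, and that the values are standardized to sum to 1. The function name and the `sigmas` parameter are hypothetical; how the standard deviations follow from the ellipsoid size is not stated here and is left to the caller.

```python
import math

def theoretical_hydrophobicity(grid_points, sigmas):
    """Idealized hydrophobicity at each grid point from a 3D Gaussian.

    grid_points: iterable of (x, y, z) in the oriented, origin-centered frame.
    sigmas: (sigma_x, sigma_y, sigma_z); an assumption supplied by the caller.
    """
    sx, sy, sz = sigmas
    raw = [
        math.exp(-x * x / (2 * sx * sx))
        * math.exp(-y * y / (2 * sy * sy))
        * math.exp(-z * z / (2 * sz * sz))
        for x, y, z in grid_points
    ]
    total = sum(raw)
    # Standardize so that the values over all grid points sum to 1.
    return [v / total for v in raw]
```

With this form, the value is highest at the center of the molecule and decays toward the surface, which matches the qualitative description of the idealized hydrophobic core.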

Empirical hydrophobicity distribution

The empirical hydrophobicity distribution can be calculated using the original function introduced by Levitt [11] (Equation 2), as given in the PDF file linked below.
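The function is likewise not reproduced in this text. The sketch below therefore rests on assumptions: it uses the distance-dependent polynomial commonly attributed to Levitt in the fuzzy-oil-drop literature, a cutoff distance of 9 Å, one hydrophobicity value per residue placed at its effective atom, and standardization to a sum of 1. The names `levitt_weight` and `empirical_hydrophobicity`, and the hydrophobicity values passed in, are hypothetical.

```python
import math

def levitt_weight(r, cutoff):
    """Distance weight: 1 at r = 0, falling smoothly to 0 at r = cutoff.

    Assumed form: 1 - 1/2 * (7 t^2 - 9 t^4 + 5 t^6 - t^8), with t = r / cutoff.
    """
    if r > cutoff:
        return 0.0
    t2 = (r / cutoff) ** 2
    return 1.0 - 0.5 * (7 * t2 - 9 * t2 ** 2 + 5 * t2 ** 3 - t2 ** 4)

def empirical_hydrophobicity(grid_points, residues, cutoff=9.0):
    """Observed hydrophobicity collected at each grid point.

    residues: list of ((x, y, z), hydrophobicity) pairs, one per effective atom.
    cutoff: interaction range in angstroms (9.0 is an assumption).
    """
    raw = []
    for g in grid_points:
        total = 0.0
        for position, hydrophobicity in residues:
            total += hydrophobicity * levitt_weight(math.dist(g, position), cutoff)
        raw.append(total)
    norm = sum(raw)  # assumed non-zero: at least one residue lies within range
    return [v / norm for v in raw]
```

Each grid point collects contributions only from residues within the cutoff, so the result reflects the local hydrophobicity density around that point.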

Prediction results

Theoretical versus empirical hydrophobicity distribution

Since the theoretical (Equation 1) and empirical (Equation 2) hydrophobicity distributions are standardized, the hydrophobicity values attributed to each grid point can be compared by a simple subtraction, which gives the difference ΔĤ. A color scale expressing the magnitude of ΔĤ in a particular area of the protein (Figure 1) visualizes where these discrepancies are located in the molecule. The profile of ΔĤi along the polypeptide chain (also in color scale) reveals the polypeptide fragments with a high difference between the idealized and empirical hydrophobicity density. The same color scale applied to a three-dimensional representation of the protein molecule allows the ligation site to be localized. The results of the analysis of the selected protein molecules are shown in Figure 1.
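The comparison step can be illustrated with a short Python sketch. The sign convention (theoretical minus empirical) and the ranking by absolute value are assumptions, since the subtraction formula is not reproduced in this text; the function names and the labels are hypothetical.

```python
def hydrophobicity_difference(theoretical, empirical):
    """Point-by-point difference between two standardized distributions.

    Assumed sign convention: theoretical minus empirical.
    """
    if len(theoretical) != len(empirical):
        raise ValueError("distributions must cover the same points")
    return [t - e for t, e in zip(theoretical, empirical)]

def top_discrepancies(delta, labels, k):
    """Return the k labeled points with the largest magnitude of difference."""
    ranked = sorted(zip(labels, delta), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]
```

For example, with `labels` holding residue identifiers and both distributions evaluated at the corresponding positions, `top_discrepancies(delta, labels, 10)` would list the ten residues that deviate most from the idealized model, which are the candidates for the ligation site.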

Figure 1

One-dimensional profiles of ΔĤ per amino acid (color scale, left column) and three-dimensional distribution of ΔĤ on the protein surface (right column): A, cAMP-dependent protein kinase complexed with 5'-adenyly-imido-triphosphate; B, cyclin-dependent protein kinase 2 complexed with 6-O-cyclohexylmethyl guanine; C, proto-oncogene tyrosine-protein kinase ABL complexed with STI-571; D, S-lectin complexed with D-galactose. The ligands (dark blue thick line) are localized at their binding sites according to the crystal structure.

Conclusion

The many proteins of unknown biological function identified on the basis of genome analysis await a unified automated method for determining their biological activity. [12] The next step is to develop methods able to predict a protein's function from an examination of its structure. Some of the techniques used to identify functionally important residues from sequence or structure are based on searching for homologues of proteins of known function. [13,14] However, homologues need not have related activity, particularly when the sequence identity is below 25%. [15] The model presented in this paper aims to localize the area responsible for ligand binding, based on the characteristics of the spatial distribution of hydrophobicity in a protein molecule. It is generally accepted that the core region is not well described by a spheroid of buried residues surrounded by surface residues, because hydrophobic channels permeate the molecule. [16,17] This being so, regions that deviate strongly from the ideal model should be identifiable by a simple comparison of the theoretical (idealized according to the Gaussian function) and empirical spatial distributions of hydrophobicity in a protein. The regions marked by high differences in hydrophobicity density seem to reveal functionally important sites in proteins.

1. From genes to protein structure and function: novel applications of computational approaches in the genomic era.

Authors: J Skolnick; J S Fetrow
Journal: Trends Biotechnol Date: 2000-01

2. Practical limits of function prediction.

3. Some factors in the interpretation of protein denaturation.

4. A simplified representation of protein conformations for rapid simulation of protein folding.

5. On the nature of the protein interior.

6. A simple method for displaying the hydropathic character of a protein.

7. Hydrophobic basis of packing in globular proteins.

8. Protein densities.

9. Comparison of molecular structures of proteins: helix content; distribution of apolar residues.

10. Fuzzy-oil-drop hydrophobic force field--a model to represent late-stage folding (in silico) of lysozyme.
