Yuko Tsuchiya, Haruki Nakamura, Kengo Kinoshita.
Abstract
A method was constructed to discriminate between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures. The method evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interface among all possible contacts in the crystal. The method uses a discriminator named "COMP", which is a linear combination of the complementarities for the above three surface features and does not correlate with the contact area. The discrimination of homo-dimer interfaces from symmetry-related crystal-packing contacts based on the COMP value achieved a modest success rate. Subsequent detailed review of the discrimination results raised the success rate to about 88.8%. In addition, our discrimination method yielded some clues for understanding the interaction patterns in several examples in the PDB. Thus, the COMP discriminator can also be used as an indicator of the "biological-ness" of protein-protein interfaces.
Keywords: biological interfaces; complementarity analysis; crystal-packing contact; homo-dimer interface; protein-protein interaction
Year: 2008 PMID: 21918609 PMCID: PMC3169932 DOI: 10.2147/aabc.s4255
Source DB: PubMed Journal: Adv Appl Bioinform Chem ISSN: 1178-6949
Figure 1. The selection scheme for the most probable biological interfaces. The most probable biological interface in each crystal is selected from among the biological contact and the crystal-packing contact(s) according to the scheme shown in this flow chart. The scheme is explained in the text.
Comparison of the structures determined by X-ray and NMR
| No. | X-ray: PDB Chain ID | X-ray: Category | NMR: PDB Chain ID | Seq ID |
|---|---|---|---|---|
| 1 | 1ci4A-B | 1 | 1qckA-B | 97.8 |
| | | | 2ezxA-B | 97.8 |
| | | | 2ezyA-B | 97.8 |
| | | | 2ezzA-B | 97.8 |
| 2 | 1kzkA-B | 1 | 1bveA-B | 91.9 |
| | | | 1bvgA-B | 91.9 |
| 3 | 1m1fA-B | 1 | 2c06A-B | 97.3 |
| 4 | 1mkkA-B | 1 | 1katV-W | 91.9 |
| 5 | 1msoB-D | 1 | 1ai0B-D | 100.0 |
| | | | 1aiyB-D | 100.0 |
| | | | 2aiyB-D | 100.0 |
| | | | 3aiyB-D | 100.0 |
| | | | 4aiyB-D | 100.0 |
| | | | 5aiyB-D | 100.0 |
Note: Seq ID is the sequence identity between the protomer in the X-ray crystal structure and that in the NMR structure.
Comparison of the structures determined under different crystallization conditions
| No. | Original entry: PDB Chain ID | Category | Evaluation | Original entry: Space group | Different crystal form: PDB Chain ID | Different crystal form: Space group |
|---|---|---|---|---|---|---|
| 1 | 1dj8C-D | 1 | biological | P 1 21 1 | 1bg8A-B | C 1 2 1 |
| 2 | 1f4mA-B | 1 | biological | P 32 | 1f4nA-B | C 1 2 1 |
| 3 | 1j59A-B | 1 | biological | C 2 2 21 | 1i5zA-B | P 21 21 21 |
| 4 | 1jm0E-F | 3 | biological | P 21 21 21 | 1jmbB-C | C 2 2 21 |
| 5 | 1ks2A-B | 1 | biological | P1 | 1lkzA-B | C 2 2 21 |
| 6 | 1m0wA-B | 1 | biological | P1 | 1m0tA-B | C 2 2 21 |
| 7 | 1m1nF-H | 1 | biological | P 1 21 1 | 1m34B-D | C 1 2 1 |
| 8 | 1m7gA-B | 2 | biological | P 21 21 21 | 1d6jA-B | C 2 2 21 |
| 9 | 1msoB-D | 1 | biological | H 3 | 1os4B-D | P 1 |
| | | | | | 1ev6B-D | P 1 21 1 |
| | | | | | 1gujB-D | P 21 21 21 |
| | | | | | 1benB-D | R 3 |
| 10 | 1nmsA-B | 1 | biological | C 1 2 1 | 1nmqA-B | P 21 21 21 |
| 11 | 1o7jB-D | 1 | biological | C 1 2 1 | 1hfkA-C | P 61 2 2 |
| 12 | 1oaoA-B | 1 | biological | C 1 2 1 | 1mjgC-D | P1 |
| 13 | 1oh0A-B | 1 | biological | C 1 2 1 | 1e3vA-B | P 21 21 21 |
| 14 | 1p1jA-B | 1 | biological | C 1 2 1 | 1p1hC-D | P 1 21 1 |
Note: “biological” means that, in the evaluation step, the contact in the biological contact set was judged to be the most probable biological interface in the crystal.
Results of the weight optimization of the COMP
| w1 | w2 | w3 | Wh | We | Ws | MCC | Threshold | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|---|---|
| 78 | 2 | 13 | 0.99 | 0.030 | 0.16 | 0.33 | 0.023 | 0.75 | 0.89 | 0.40 |
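The weights, threshold and performance figures in this table can be tied together with a small sketch. It assumes, based only on the abstract's description of COMP as a linear combination of three complementarities, that COMP = Wh·Ch + We·Ce + Ws·Cs and that a contact is called biological when COMP exceeds the threshold; the exact definition is not given in this excerpt. The function names and the complementarity inputs are hypothetical. The metric formulas are the standard confusion-matrix definitions.

```python
from math import sqrt

# Weights and threshold as reported in the table above.
W_H, W_E, W_S = 0.99, 0.030, 0.16
THRESHOLD = 0.023


def comp_score(c_hydro, c_elec, c_shape):
    """Assumed form of COMP: weighted sum of the three complementarities."""
    return W_H * c_hydro + W_E * c_elec + W_S * c_shape


def is_biological(c_hydro, c_elec, c_shape, threshold=THRESHOLD):
    """Hypothetical decision rule: biological if COMP exceeds the threshold."""
    return comp_score(c_hydro, c_elec, c_shape) > threshold


def binary_metrics(tp, fn, tn, fp):
    """Standard accuracy, sensitivity, specificity and MCC from counts."""
    total = tp + fn + tn + fp
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
```

Given confusion counts for the biological and crystal-packing contact sets, `binary_metrics` returns the same four quantities that the table reports; the counts themselves are not listed in this excerpt.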
Figure 2. The relative frequencies of the COMP values in the biological (BIO, thick line) and crystal-packing (CRY, dotted line) contact sets.
Figure 3. The relative frequencies of the complementarities for a) hydrophobicity, b) electrostatic potential and c) shape. The thick lines in the three panels indicate the distributions of complementarities in the biological contact set (BIO), and the dotted lines indicate those in the crystal-packing contact set (CRY).
Figure 4. Scatter plots of the COMP against the contact area in a) the biological contact set (BIO) and b) the crystal-packing contact set (CRY). In each panel, each symbol represents one contact; the horizontal dotted line indicates the COMP threshold (0.023), and the two vertical dotted lines indicate the contact area criteria (127.4 and 500.0 Å²). The lower panels in both a) and b) show an enlarged view of the region below 1000.0 Å². Some entries discussed here are marked with their PDB IDs.
Summary of the classification, the discrimination and the evaluation
| Category | Classification: biological | Classification: crystal-packing | Number | % | Discrimination: biological | Discrimination: crystal-packing | Discrimination: non bio | Evaluation: OK | Evaluation: NG | Evaluation: Excluded |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | COMP/AREA | – | 255 | 90.4 | 236 (j) | 0 | 19 (k) | 235 (j) / 8 (k) | 0 (j) / 10 (k) | 1 (j) / 1 (k) |
| 2 | COMP | AREA | 3 | 1.1 | 3 | 0 | 0 | 1 | 2 | 0 |
| 3 | AREA | COMP | 16 | 5.7 | 0 | 15 (l) | 1 (m) | 0 (l) / 0 (m) | 14 (l) / 1 (m) | 1 (l) / 0 (m) |
| 4 | – | COMP/AREA | 8 | 2.8 | 0 | 8 | 0 | 3 | 4 | 1 |
| Total | | | 282 | 100 | 239 (84.8%) | 23 (8.2%) | 20 (7.1%) | 247 (88.8%) | 31 (11.2%) | 4 |
Notes: Classification, biological: the biological contact had the largest COMP (COMP) and/or the largest area (AREA), or had neither the largest COMP nor the largest area (–);
Classification, crystal-packing: a crystal-packing contact had the largest COMP (COMP) and/or the largest area (AREA), or had neither the largest COMP nor the largest area (–);
Number: number of entries;
Discrimination, biological: number of entries in which the biological contact was judged to be the most probable biological interface;
Discrimination, crystal-packing: number of entries in which the crystal-packing contact was judged to be the most probable biological interface;
Discrimination, non bio: number of entries in which neither the biological nor the crystal-packing contact was judged to be biological;
OK: number of entries in which the discrimination result agreed with the (probable) actual biological state;
NG: number of entries in which the discrimination result disagreed with the (probable) actual biological state;
Excluded: number of entries excluded from the estimation of the success rate in the evaluation step. In category 1, the numbers of entries marked “j” or “k” in the Evaluation columns derive from those marked “j” or “k” in the Discrimination columns. In category 3, the numbers of entries marked “l” or “m” in the Evaluation columns derive from those marked “l” or “m” in the Discrimination columns.
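The category assignment and the 88.8% success rate in the table above can be reproduced with a short sketch. It assumes a simplified reading of the Classification columns, in which whichever contact set does not hold the largest COMP (or area) is taken to lose it to the other set, with no handling of ties. The function and variable names are hypothetical; the counts are copied from the Evaluation columns.

```python
def category(bio_has_max_comp, bio_has_max_area):
    """Category 1-4 from which contact set holds the largest COMP and area."""
    if bio_has_max_comp and bio_has_max_area:
        return 1  # biological: COMP/AREA, crystal-packing: neither
    if bio_has_max_comp:
        return 2  # biological: COMP, crystal-packing: AREA
    if bio_has_max_area:
        return 3  # biological: AREA, crystal-packing: COMP
    return 4      # biological: neither, crystal-packing: COMP/AREA


# (OK, NG, excluded) per category, summed over the j/k and l/m subgroups.
EVALUATION = {
    1: (235 + 8, 0 + 10, 1 + 1),
    2: (1, 2, 0),
    3: (0 + 0, 14 + 1, 1 + 0),
    4: (3, 4, 1),
}


def success_rate(evaluation):
    """Fraction of OK entries among those not excluded."""
    ok = sum(counts[0] for counts in evaluation.values())
    ng = sum(counts[1] for counts in evaluation.values())
    return ok / (ok + ng)


total_entries = sum(sum(counts) for counts in EVALUATION.values())
rate_percent = round(100 * success_rate(EVALUATION), 1)
```

With these counts, 247 of the 278 non-excluded entries are OK, which matches the 88.8% in the Total row; the four excluded entries bring the total to 282.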
Summary of the evaluation results
| Category (Number) | Discrimination (Number) | Evaluation | Bio-dimer | Result | Number |
|---|---|---|---|---|---|
| 1 (255) | biological (236) | No crystal-packing contact with the area > the first area criterion. | biological | OK | 177 |
| | | No crystal-packing contact with the COMP > the threshold. | biological | OK | 26 |
| | | Biological contact has a large area (>500.0 Å²). | biological | OK | 18 |
| | | Only biological contact meets only the second area criterion. | biological | OK | 7 |
| | | Biological contact is an actual biological interface based on the literature. | biological | OK | 7 |
| | | no literature (1pug) (excluded) | – | – | 1 |
| | nonbio (19) | Biological contact is not a largest interface in multimeric complex. | biological | NG | 7 |
| | | The protein acts as a monomer. | nonbio | OK | 8 |
| | | Biological unit is dimeric based on the literature. | biological | NG | 3 |
| | | Biological contact is a part of the biological dimer interface (1jy2). (excluded) | – | – | 1 |
| 2 (3) | biological (3) | Biological contact is a biological interface based on the literature (1m7g). | biological | OK | 1 |
| | | The protein acts as a monomer. | nonbio | NG | 2 |
| 3 (16) | crystal-packing (15) | Biological contact is a biological interface based on the literature (1jm0, etc.). | biological | NG | 10 |
| | | The protein acts as a monomer. | nonbio | NG | 4 |
| | | no literature (1o1h) (excluded) | – | – | 1 |
| | nonbio (1) | Biological unit is dimeric based on the literature. | biological | NG | 1 |
| 4 (8) | crystal-packing (8) | Crystal-packing contact may be biologically relevant (1h6p, 1ex2, 1l6r). | crystal-packing | OK | 3 |
| | | Biological contact may be biologically relevant (1iu8). | biological | NG | 1 |
| | | no information about the biological assembly (1auv) (excluded) | – | – | 1 |
| | | The protein acts as a monomer. | nonbio | NG | 3 |
Notes: Number of entries;
Discrimination: entries in the “biological” category are those for which the biological contact was judged to be the most probable biological interface in the discrimination step, whereas entries in the “crystal-packing” category are those for which the crystal-packing contact was judged to be the most probable biological interface. Entries in the “nonbio” category are those for which neither the biological nor the crystal-packing contact was judged to be biological;
Bio-dimer: the contact concluded to be the (probable) actual biological contact in the evaluation step. “nonbio” means that neither the biological nor the crystal-packing contact is biological;
Result: OK, the discrimination result agreed with the actual biological state concluded in the evaluation; NG, the discrimination result disagreed with the actual biological state concluded in the evaluation; –, the entry was excluded from the estimation of the success rate.
Figure 5. The dimer structures within the ASUs in a) 1d6j and b) 1m7g. The regions circled by the yellow lines indicate the N-terminal region of one subunit in each ASU dimer. The lower figures show the dimers in the upper figures rotated by 90 degrees around the two-fold axis. In the lower dimer of 1m7g b), the interaction between the ASU subunit colored in blue and the subunit colored in white, which lies in the cell adjacent to the central unit cell, corresponds to the crystal-packing contact mentioned in the text.
Figure 6. Two possible tetramers in the crystal of 1auv. In the upper figure, the left complex surrounded by the green line is the biological tetramer according to the primary citation of this entry, and the right one surrounded by the yellow line is another possibility. Both tetramers are tightly packed against each other in the crystal. The lower figures show the biological contacts in these two tetramers with arrows of the same color as the line surrounding the corresponding tetramer. The green arrow with “2 (BIO)” represents the biological contact which has the second largest area in the left tetramer. The yellow arrow with “2 (CRY)” corresponds to the crystal-packing contact which is the second largest contact in the right tetramer and is also the crystal-packing contact formed between the left tetramer and the neighboring tetramer including the right half of the right tetramer on the left side. The arrows with “1” represent the contacts with the largest area in both tetramers; these two contacts can be similar.