Rui-Sheng Wang, Yong Wang, Ling-Yun Wu, Xiang-Sun Zhang, Luonan Chen.
Abstract
BACKGROUND: Domains are the basic functional units of proteins. It is believed that protein-protein interactions are realized through domain interactions. Revealing multi-domain cooperation can provide deep insights into the essential mechanism of protein-protein interactions at the domain level and be further exploited to improve the accuracy of protein interaction prediction.
Year: 2007 PMID: 17937822 PMCID: PMC2222654 DOI: 10.1186/1471-2105-8-391
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. An illustrative example of multi-domain interactions. (a) All multi-domain pairs are listed for two proteins P1 and P2. Proteins: P1 = {D, D, D}, P2 = {D, D}; Domains: D, D, D, D, D. (b) Illustration of domain interactions when multi-domain pairs are considered in the proposed model. There is one pair of interacting proteins and three pairs of non-interacting proteins. The bold line (red) represents the interacting domain pair, while the dotted lines (green) are the deleted non-interacting domain pairs.
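The enumeration in panel (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a multi-domain pair is taken to be any non-empty set of domains from one protein paired with any non-empty set from the other, the labels `D1` to `D5` are hypothetical stand-ins for the subscripts lost in the caption, and `max_size` is a hypothetical knob for capping the size of each domain set.

```python
from itertools import combinations


def nonempty_subsets(domains, max_size=None):
    """Yield every non-empty subset of a protein's distinct domains."""
    unique = sorted(set(domains))
    limit = len(unique) if max_size is None else min(max_size, len(unique))
    for size in range(1, limit + 1):
        for combo in combinations(unique, size):
            yield frozenset(combo)


def multi_domain_pairs(p1_domains, p2_domains, max_size=None):
    """Pair each domain subset of protein 1 with each subset of protein 2."""
    return [
        (left, right)
        for left in nonempty_subsets(p1_domains, max_size)
        for right in nonempty_subsets(p2_domains, max_size)
    ]


# Hypothetical labels: the caption only says P1 has three domains, P2 two.
p1 = ["D1", "D2", "D3"]
p2 = ["D4", "D5"]

pairs = multi_domain_pairs(p1, p2)
print(len(pairs))  # (2**3 - 1) * (2**2 - 1) = 21

# Restricting each side to a single domain gives ordinary two-domain pairs.
print(len(multi_domain_pairs(p1, p2, max_size=1)))  # 3 * 2 = 6
```

The count grows exponentially with the number of domains per protein, which is why a cap such as `max_size` may be useful in practice.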
Superdomains detected by our method from MIPS protein interaction data, where GO annotations are denoted in italic
| Superdomains | Descriptions | GO similarity |
| PF00488, PF05192 | (1) MutS domain V, | 2-1-2-5-11-27-1-8-10 |
| PF02775, PF00205 | (1) Thiamine pyrophosphate enzyme, C-terminal TPP binding domain, | 1-1-2-12-1-8 |
| PF08033, PF04810 | (1) Sec23/Sec24 beta-sandwich domain | |
| PF03953, PF00091 | (1) Tubulin/FtsZ family, C-terminal domain, | |
| PF07687, PF01546 | (1) Peptidase dimerisation domain, | 1-1-3-16 |
| PF05000, PF00623 | (1) RNA polymerase Rpb1, domain 4, | 1-1-3-39-16-3-16 |
| PF08544, PF00288 | (1) GHMP kinases C terminal | |
| PF01798, PF08060 | (1) Putative snoRNA binding domain | |
| PF02800, PF00044 | (1) Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain, | 2-1-2-5-11-1-4-9-1-5-1 |
| PF08030, PF08022 | (1) Ferric reductase NAD binding domain | |
Cooperative domains detected by our method from MIPS protein interaction data, where GO annotations are denoted in italic
| Cooperative domains (Interactor I) | Descriptions | Interactor II | Descriptions |
| PF00069, PF00786 | (1) Protein kinase domain, | PF00018 | SH3 domain, in a variety of proteins with enzymatic activity |
| PF00400, PF00646 | (1) WD domain, G-beta repeat, coordinating multi-protein complex assemblies | PF01466 | Skp1 family, dimerisation domain |
| PF00439, PF00176 | (1) Bromodomain, interacting specifically with acetylated lysine | PF04433 | SWIRM domain, mediating protein-protein interactions |
| PF00169, PF00787 | (1) PH domain | PF08632 | Sporulation protein Zds1 C terminal region, suppress the calcium sensitivity of Zds1 deletions |
| PF00069, PF00169 | (1) Protein kinase domain, | PF00018 | SH3 domain, in a variety of proteins with enzymatic activity |
| PF00018, PF00063 | (1) SH3 domain, in a variety of proteins with enzymatic activity | PF02205 | WH2 motif, actin-binding motif |
| PF00806, PF00076 | (1) Pumilio-family RNA binding repeat, | PF00501 | AMP-binding enzyme, |
| PF02985, PF03810 | (1) HEAT repeat, involved in intracellular transport processes | PF04096 | Nucleoporin autopeptidase, |
| PF02178, PF00271 | (1) AT hook motif, DNA binding motifs | PF00249 | Myb-like DNA-binding domain, |
Strongly cooperative domains detected by our method from MIPS protein interaction data, where GO annotations are denoted in italic
| Cooperative domains (Interactor I) | Descriptions | Interactor II | Descriptions |
| PF00618, PF00018 | (1) Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif, | PF00012 | Hsp70 protein, involved in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, etc.) |
| PF01466, PF03931 | (1) Skp1 family, dimerisation domain | PF00646 | F-box domain, mediating protein-protein interactions |
| PF04998, PF00623 | (1) RNA polymerase Rpb1, domain 5, | PF01191 | RNA polymerase Rpb5, C-terminal domain, |
| PF00806, PF00076 | (1) Pumilio-family RNA binding repeat, | PF00660 | Seripauperin and TIP1 family, |
| PF00036, PF08226 | (1) EF hand, | PF07651 | ANTH domain, |
| PF00443, PF00581 | (1) Ubiquitin carboxyl-terminal hydrolase, | PF00611 | Fes/CIP4 homology domain, |
| PF00620, PF00787 | (1) RhoGAP domain, | PF08632 | Sporulation protein Zds1 C terminal region, sporulation, suppress the calcium sensitivity of Zds1 deletions |
| PF01426, PF00439 | (1) BAH domain, DNA binding, involved in protein-protein interaction | PF00076 | RNA recognition motif, |
Figure 2. Cooperative domains in the complex crystal structure formed by proteins P02994 (with ORFs: YBR118W, YPR080W) and P32471 (with ORF: YAL003W). Protein sequences are shown using thick gray lines, and Pfam domain annotations are shown using colored rectangular boxes and drawn to scale (based on the Pfam database). The names of the protein sequences in this protein complex are listed to the upper left of the domain architecture. The identified cooperative domain pairs are listed to the upper right of the domain architecture. The domain names are labeled in the same color as in the Pfam domain annotation. The cartoon of the PDB crystal structure (PDB ID: 1f60, crystal structure of the yeast elongation factor complex) demonstrates the cooperative domain interactions (where domain colors are consistent with the domain annotation), i.e. domain PF00736 in protein P32471 interacts physically with domains of protein P02994. Other complexes in PDB containing these cooperative domains are also listed by their matched PDB IDs and chain IDs.
Comparisons of several methods in terms of RMSE and training time on Ito's dataset
| | EM | ASSOC | ASNM | | |
| Train | |||||
| 1st | 0.4693 | 0.4537 | 0.0486 | 0.0084 | 0.0077 |
| 2nd | 0.4810 | 0.4670 | 0.0486 | 0.0086 | 0.0079 |
| 3rd | 0.4746 | 0.4617 | 0.0508 | 0.0071 | 0.0060 |
| 4th | 0.4683 | 0.4545 | 0.0474 | 0.0076 | 0.0057 |
| 5th | 0.4676 | 0.4540 | 0.0493 | 0.0072 | 0.0066 |
| Average | 0.4722 | 0.4582 | 0.0489 | 0.0073 | 0.0068 |
| Time (seconds) | 6.6622 | 0.0090 | 0.003 | 1.099 | 0.007 |
| Test | |||||
| 1st | 0.6624 | 0.6072 | 0.0743 | 0.0224 | 0.0189 |
| 2nd | 0.4880 | 0.4938 | 0.0531 | 0.0104 | 0.0128 |
| 3rd | 0.5670 | 0.5338 | 0.0591 | 0.0425 | 0.0427 |
| 4th | 0.5848 | 0.5745 | 0.0641 | 0.0296 | 0.0271 |
| 5th | 0.6417 | 0.6308 | 0.0753 | 0.0354 | 0.0307 |
| Average | 0.5888 | 0.5680 | 0.0652 | 0.0281 | 0.0265 |
Comparisons of several methods in terms of RMSE and training time on Krogan's yeast extended dataset
| | EM | ASSOC | ASNM | | |
| Train | |||||
| 1st | 0.4156 | 0.4525 | 0.4580 | 0.1262 | 0.1359 |
| 2nd | 0.4176 | 0.4521 | 0.4607 | 0.1248 | 0.1360 |
| 3rd | 0.4196 | 0.4548 | 0.4615 | 0.1291 | 0.1365 |
| 4th | 0.4178 | 0.4535 | 0.4585 | 0.1243 | 0.1337 |
| 5th | 0.4184 | 0.4546 | 0.4602 | 0.1256 | 0.1338 |
| Average | 0.4178 | 0.4535 | 0.4598 | 0.1260 | 0.1352 |
| Time (seconds) | 2699.9 | 0.2000 | 0.1968 | 118.21 | 6.5092 |
| Test | |||||
| 1st | 0.5504 | 0.5548 | 0.4931 | 0.3967 | 0.3588 |
| 2nd | 0.5390 | 0.5441 | 0.4906 | 0.3804 | 0.3407 |
| 3rd | 0.5372 | 0.5407 | 0.4822 | 0.3687 | 0.3372 |
| 4th | 0.5422 | 0.5364 | 0.4805 | 0.3854 | 0.3366 |
| 5th | 0.5333 | 0.5291 | 0.4747 | 0.3907 | 0.3455 |
| Average | 0.5404 | 0.5410 | 0.4842 | 0.3844 | 0.3437 |
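The RMSE figures and per-fold averages in the two tables above can be reproduced in form with a short Python sketch. It assumes RMSE is taken between predicted interaction probabilities and observed 0/1 interaction labels; the toy `predicted` and `observed` values are hypothetical, while the five fold values are the EM test-set entries from the Krogan table.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def fold_average(values):
    """Arithmetic mean over cross-validation folds."""
    return sum(values) / len(values)


# Hypothetical toy data: predicted probabilities vs. observed labels.
predicted = [0.9, 0.2, 0.6, 0.1]
observed = [1, 0, 1, 0]
print(round(rmse(predicted, observed), 4))  # 0.2345

# EM test-set RMSE for the five folds, copied from the Krogan table.
em_test_folds = [0.5504, 0.5390, 0.5372, 0.5422, 0.5333]
print(round(fold_average(em_test_folds), 4))  # 0.5404, the "Average" row
```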
Figure 3. Comparisons of RMSE on two-domain pairs and on multiple-domain pairs for Krogan's yeast extended datasets. (a) The results of LPM on training. (b) The results of LPM on testing. (c) The results of APMM on training. (d) The results of APMM on testing.
Figure 4. (a) ROC curve comparison of APMM and the extended EM on multiple-organism data. (b) ROC curve comparison of APMM based on two-domain pairs and multi-domain pairs.
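For readers who want to build a ROC comparison of this kind, the following Python sketch computes ROC points and the area under the curve from scored protein pairs. It is a generic construction, not the authors' evaluation code: labels are assumed to be 1 for interacting and 0 for non-interacting pairs, and the scores shown are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    The threshold is swept from the highest score downwards, and pairs
    with tied scores are admitted together.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")
    ranked = sorted(zip(scores, labels), key=lambda item: -item[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical predicted interaction probabilities and true labels.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(auc(roc_points(scores, labels)))  # 0.75
```

A curve lying closer to the upper-left corner, and therefore with a larger area, indicates better separation of interacting from non-interacting pairs.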
The overlap of the predicted domain interactions by APMM and LPM with those in iPfam, where λ denotes domain interaction probability, 'Single organism' means the training set of protein interactions is only from yeast, 'Multiple organisms' means the training set is from three organisms: yeast, worm and fly
| Thresholds | Single organism ( | Multiple Organisms ( |
| APMM | ||
| λ | 110 (8.1e-009) | 256 (8.4e-012) |
| λ | 99 ( | 202 (2.9e-011) |
| λ | 61 (9.8e-010) | 149 ( |
| λ | 52 (3.2e-010) | 127 (8.1e-013) |
| λ | 49 (1.5e-013) | 91 (3.3e-012) |
| LPM | ||
| λ | 109 (2.1e-008) | 256 (5.7e-013) |
| λ | 97 (8.8e-013) | 201 (5.9e-013) |
| λ | 61 ( | 148 (2.9e-012) |
| λ | 54 (4.4e-011) | 130 ( |
| λ | 49 ( | 93 ( |
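The overlap counts in a table like this one can be obtained by thresholding the predicted interaction probabilities λ and intersecting the surviving domain pairs with a reference set. The Python sketch below shows one way to do that; the domain labels, probabilities and thresholds are hypothetical, domain pairs are treated as unordered, and the significance values in parentheses are not reproduced because this section does not state how they were computed.

```python
def normalize(pair):
    """Treat a domain pair as unordered, so (A, B) equals (B, A)."""
    first, second = pair
    return tuple(sorted((first, second)))


def overlap_at_thresholds(predicted, reference, thresholds):
    """Count predicted pairs and reference overlaps for each threshold.

    predicted  : dict mapping (domain, domain) to probability lambda
    reference  : iterable of known interacting domain pairs
    thresholds : iterable of lambda cut-offs
    Returns {threshold: (number predicted, number also in reference)}.
    """
    known = {normalize(pair) for pair in reference}
    summary = {}
    for cutoff in thresholds:
        selected = {
            normalize(pair)
            for pair, probability in predicted.items()
            if probability >= cutoff
        }
        summary[cutoff] = (len(selected), len(selected & known))
    return summary


# Hypothetical toy input, not taken from the paper.
predicted = {("A", "B"): 0.9, ("B", "C"): 0.6, ("C", "D"): 0.3}
reference = [("B", "A"), ("D", "C")]

for cutoff, (total, shared) in overlap_at_thresholds(
    predicted, reference, [0.5, 0.2]
).items():
    print(cutoff, total, shared)
```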
The overlap of the predicted domain interactions by APMM and LPM with those in InterDom, where λ denotes domain interaction probability
| Thresholds | Total domain pairs | InterDom overlap ( | Mean significance |
| APMM | |||
| λ | 26407 | 8085 (8.0e-012) | 75.3815 |
| λ | 16798 | 5834 (1.9e-011) | 96.3991 |
| λ | 9416 | 3749 ( | 124.9704 |
| λ | 7582 | 3125 (1.5e-012) | 140.3715 |
| λ | 4349 | 1800 ( | 170.5464 |
| LPM | |||
| λ | 26326 | 8086 (1.7e-011) | 75.3558 |
| λ | 16854 | 5844 ( | 96.6538 |
| λ | 9424 | 3753 ( | 124.8600 |
| λ | 7561 | 3101 (6.1e-012) | 140.8661 |
| λ | 4322 | 1788 (1.3e-012) | 171.6955 |
Figure 5. Mean confidence score of the domain interactions predicted by APMM at different domain interaction thresholds, based on single-organism data and on multiple-organism data.
Figure 6. The overlaps of domain-domain interactions predicted by APMM, DPEA and PE with iPfam.
Figure 7. The distribution of the predicted DDI overlaps with iPfam by DPEA, PE and APMM.
Figure 8. Reconstruction of the DNA-directed RNA polymerase complex. (a) The RNA Polymerase II-TFIIS complex (PDB ID 1y1v) with 13 subunits (from chain A to chain S). Every chain is one protein (shown by its UniProtKB accession), and the interactions among them form the large complex. (b) The PfamA domain architecture of every protein. (c) The cooperative domains identified by our method, with the protein interaction pairs containing them. The red or blue colors of proteins and domains indicate their memberships.