Lenna X Peterson1, Amitava Roy1,2,3, Charles Christoffer4, Genki Terashi1,5, Daisuke Kihara1,4.
Abstract
Disordered protein-protein interactions (PPIs), those involving a folded protein and an intrinsically
Year: 2017 PMID: 28394890 PMCID: PMC5402988 DOI: 10.1371/journal.pcbi.1005485
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Existing peptide-protein complex modeling methods.
| Method | Category | Availability | Requires binding site | Initial peptide conformation | Tested (max) amino acids |
|---|---|---|---|---|---|
| | Docking | No | No | TINKER | 4 |
| | Docking | No | Yes | Bound conformation | 16 |
| Rosetta FlexPepDock ab-initio | Docking | Yes | Yes | Predicted fragments | 15 (30) |
| HADDOCK | Docking | Yes | Yes | | 15 |
| pepATTRACT | Docking | Yes | No | | 15 |
| CABS-dock | Docking | Yes | No | Random | 15 (30) |
| MDockPeP | Docking | No | No | Sequence-based search | 15 |
| DynaDock | MD | No | Yes | Bound conformation | 16 |
| | MD | No | No | Bound conformation | 13 |
| AnchorDock | MD | No | No | Extended/MD | 15 |
a: "Tested" is the length of the longest peptide in the published test set; "max" is the maximum length allowed by the web server.
Fig 1. IDP-LZerD consists of four steps.
The steps are (1) fragment structure prediction, (2) fragment docking, (3) path assembly, and (4) refinement. Steps 1 and 2 correspond to “dock” and Steps 3 and 4 correspond to “coalesce.”
Disordered protein complex data set.
| Disordered protein name | Receptor protein name | Bound receptor PDB ID | Ligand chain | L | Unbound receptor PDB ID | Pocket RMSD (Å) | DisProt ID or ref. |
|---|---|---|---|---|---|---|---|
| P53, transactivation domain | MDM2, N-terminal domain | 1ycrA | B | 15 | 1z1mA | 2.93 | DP00086 |
| Myelin basic protein | MHC class II antigen DRA/DRB5 | 1fv1AB | C | 20 | 4ah2AB | 0.91 | DP00237 |
| eIF4E-binding protein 1 | eukaryotic initiation factor 4E | 1wkwA | B | 20 | 1ipbA | 0.78 | DP00028 |
| Protein kinase inhibitor α | PKA C-α | 2cpkE | I | 20 | 1j3hA | 4.57 | DP00015 |
| c-Myb | Cbp/p300, KIX domain | 1sb0A | B | 25 | 4i9oA | 2.80 | [ |
| Cibulot | α-actin-1 | 1sqkA | B | 25 | 1ijjA | 0.79 | [ |
| Bcl2-associated Antagonist of cell Death (BAD) | Bcl2-like protein 1 (Bcl2-L-1) | 2bzwA | B | 27 | 1pq0A | 3.00 | DP00563 |
| Regulatory protein SIR3 | DNA-binding protein RAP1 | 3owtAB | C | 27 | 3cz6AB | 1.30 | DP00533 |
| hSARA, SMAD2-binding domain | hSMAD2 | 1devA | B | 41 | 1khxA | 3.94 | DP00141 |
| Cbp/p300-interacting transactivator 2 (CITED2) | Cbp/p300, TAZ1 domain | 1p4qB | A | 44 | 1l3eB | 5.11 | DP00356 |
| Transcription factor 7-like 2 (TCF7L2) | β-catenin | 1jpwA | D | 45 | 2z6hA | 0.98 | DP00175 |
| Hypoxia-inducible factor 1-α | Cbp/p300, TAZ1 domain | 1l8cA | B | 51 | 1u2nA | 2.87 | DP00262 |
| Nucleoporin NUP2 | Importin subunit | 2c1tA | C | 51 | 1bk5A | 1.44 | DP00222 |
| Synaptosomal-associated protein 25, SNARE domain | Botulinum neurotoxin type A (BoNT/A) | 1xtgA | B | 59 | 1xtfA | 4.24 | DP00068 |
a: removed residues 1 to 24;
b: removed chain B engineered residues -30 to 0;
c: removed the stabilizing small molecule KI1 (1-4-[4-chloro-3-(trifluoromethyl)phenyl]-4-hydroxypiperidin-1-yl-3-sulfanylpropan-1-one);
d: superimposed 2 copies of 3cz6A onto 3owtAB;
e: both chains A and B of 1p4q are disordered, so to create an unbound receptor for 1p4qA from 1l3eBA, we removed chain A, which has a different sequence than 1p4qA;
f: removed homodimer.
Disordered protein complex test set.
| Disordered protein name | Receptor protein name | Bound receptor PDB ID | Ligand chain | L | Unbound receptor PDB ID | Pocket RMSD (Å) | DisProt ID or ref. |
|---|---|---|---|---|---|---|---|
| Peroxisomal targeting signal 1 receptor | PEX14 | 2w84A | B | 20 | 5aonA | 1.19 | DP00472 |
| CDK inhibitor 1 | Proliferating cell nuclear antigen | 1axcA | B | 22 | 1vymA | 1.86 | DP00016 |
| Alpha trans-inducing protein | Transcriptional coactivator PC4 | 2pheAB | C | 26 | 1pcfAB | 1.87 | DP00087 |
| Protease A inhibitor 3 | Proteinase A | 1g0vA | B | 31 | 1fmxA | 3.80 | DP00179 |
| Nuclear factor erythroid 2-related factor 2 | Keap1 | 3wn7A | B | 35 | 1x2jA | 0.90 | DP00968 |
| Protein phosphatase 1 regulatory subunit 12A | PP-1B | 1s70A | B | 39 | 4ut2A | 0.92 | DP00218 |
| Protein phosphatase inhibitor 2 | PP-1G | 2o8gA | I | 40 | 1jk7A | 1.45 | DP00815 |
| Outer membrane virulence protein YopE | YopE chaperone SycE | 1l2wAB | I | 69 | 1jyaAB | 1.27 | [ |
a: template-based model using MODELLER [43] (5aonA was used as the template, which has 46.9% sequence identity to 2w84A).
Secondary structure prediction accuracy.
| Method | Accuracy |
|---|---|
| JPred | 66.4% |
| Porter | 81.2% |
| PSIPRED | 69.7% |
| SSpro | 75.4% |
| All | 57.0% |
| Best | 86.1% |
Accuracy: percentage of all residues correctly predicted. Secondary structure classes were assigned using DSSP [48]. DSSP classes GHI are considered H, EB are considered E, and all others are considered C. All: all four methods predict the correct class. Best: at least one of the four methods predicts the correct class. Computed using 1ycrB, 1fv1C, 1wkwB, 2cpkI, 1sb0B, 1sqkB, 2bzwB, 3owtC, 1devB, 1l8cB, and 1xtgB.
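The class reduction and the three accuracy measures described in this footnote can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation script: the function names are hypothetical, and the DSSP string and predictions are toy inputs rather than data from the paper. Only the mapping (GHI to H, EB to E, everything else to C) and the definitions of Accuracy, All, and Best are taken from the footnote.

```python
# Hypothetical helper names; only the class mapping and the definitions of
# "Accuracy", "All", and "Best" come from the table footnote.
DSSP_TO_3STATE = {"G": "H", "H": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Collapse a DSSP string to H/E/C (every unlisted class becomes C)."""
    return "".join(DSSP_TO_3STATE.get(s, "C") for s in dssp)


def accuracy(pred: str, native: str) -> float:
    """Fraction of residues whose predicted class matches the native class."""
    if len(pred) != len(native):
        raise ValueError("prediction and native must have equal length")
    return sum(p == n for p, n in zip(pred, native)) / len(native)


def consensus_accuracy(preds: list[str], native: str) -> tuple[float, float]:
    """Return (All, Best): every method correct vs. at least one correct."""
    all_ok = best_ok = 0
    for i, n in enumerate(native):
        hits = [p[i] == n for p in preds]
        all_ok += all(hits)
        best_ok += any(hits)
    return all_ok / len(native), best_ok / len(native)


native = reduce_dssp("-GGGHHHHT-EEB-")        # toy DSSP string, not real data
preds = ["CHHHHHHHCCEEEC", "CCCHHHHHCCEECC"]  # toy predictions from two methods
print(accuracy(preds[1], native), consensus_accuracy(preds, native))
```

By construction, All can never exceed the accuracy of the weakest method and Best can never fall below that of the strongest, which matches the ordering of the values in the table.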
9-residue IDR complex test set selected from ELM.
| Disordered protein name | Receptor protein name | Bound receptor PDB ID | Ligand chain | First res | Unbound receptor PDB ID | DisProt ID | ELM ID |
|---|---|---|---|---|---|---|---|
| Cyclin-dependent kinase inhibitor 1B | CDK2/Cyclin A | 1jsuAB | C | 25 | 2c5nAB | DP00018 | - |
| Cyclin-dependent kinase inhibitor 1B | CDK2/Cyclin A | 1jsuAB | C | 34 | 2c5nAB | DP00018 | - |
| Cyclin-dependent kinase inhibitor 1B | CDK2/Cyclin A | 1jsuAB | C | 43 | 2c5nAB | DP00018 | - |
| Cyclin-dependent kinase inhibitor 1B | CDK2/Cyclin A | 1jsuAB | C | 52 | 2c5nAB | DP00018 | ELMI000069 |
| PIFtide | Protein kinase Akt-2 | 1o6lA | A | 469 | 1gzkA | DP00304 | ELMI001633 |
| Glycogen synthase kinase-3 | Protein kinase Akt-2 | 1o6lA | C | 4 | 1gzkA | DP00385 | - |
| Protein phosphatase 1 regulatory subunit 12A | PP-1B | 1s70A | B | 1 | 4ut2A | DP00218 | - |
| Protein phosphatase 1 regulatory subunit 12A | PP-1B | 1s70A | B | 10 | 4ut2A | DP00218 | ELMI002747 |
| Protein phosphatase 1 regulatory subunit 12A | PP-1B | 1s70A | B | 22 | 4ut2A | DP00218 | - |
| Protein phosphatase 1 regulatory subunit 12A | PP-1B | 1s70A | B | 31 | 4ut2A | DP00218 | ELMI001397 |
| Peroxisomal targeting signal 1 receptor | PEX14 | 2w84A | B | 101 | 5aonA | DP00472 | ELMI002213 |
a: template-based model using MODELLER [43] (5aonA was used as the template, which has 46.9% sequence identity to 2w84A).
Fragment modeling and docking accuracy for 9-residue IDR complexes from ELM.
| Bound PDB ID | First res | Fragments: min. RMSD (Å) | Bound, all docked: min. RMSD (Å) | Bound, selected docked: min. RMSD (Å) | Unbound PDB ID | Unbound, all docked: min. RMSD (Å) | Unbound, selected docked: min. RMSD (Å) |
|---|---|---|---|---|---|---|---|
| 1jsuAB | 25 | 1.8 | 3.5 | 3.5 | 2c5nAB | 3.2 | 3.2 |
| 1jsuAB | 34 | 1.4 | 2.8 | 9.5 | 2c5nAB | 3.2 | 9.2 |
| 1jsuAB | 43 | 0.5 | 1.6 | 1.9 | 2c5nAB | 1.6 | 2.6 |
| 1jsuAB | 52 | 0.6 | 1.6 | 3.1 | 2c5nAB | 3.1 | 3.1 |
| 1o6lA | 4 | 2.1 | 3.9 | 3.9 | 1gzkA | 3.6 | 4.6 |
| 1o6lA | 469 | 2.9 | 5.1 | 5.7 | 1gzkA | 5.5 | 5.5 |
| 1s70A | 1 | 1.3 | 3.5 | 9.1 | 4ut2A | 3.5 | 8.3 |
| 1s70A | 10 | 0.4 | 3.1 | 3.1 | 4ut2A | 2.6 | 2.6 |
| 1s70A | 22 | 2.4 | 3.6 | 7.5 | 4ut2A | 3.3 | 5.8 |
| 1s70A | 31 | 1.6 | 3.8 | 4.1 | 4ut2A | 3.3 | 4.5 |
| 2w84A | 101 | 0.4 | 2.8 | 2.8 | 5aonA | 2.0 | 2.0 |
| Average | | 1.4 | 3.2 | 4.9 | | 3.2 | 4.7 |
First res.: The first amino acid position of the 9-residue long fragments in the protein. Fragments: minimum backbone RMSD of predicted fragments to native. All docked: minimum L-RMSD of all docked fragments (has a lower bound of the fragment RMSD). Selected docked: minimum L-RMSD of top 4,500 fragments by DI score (Z(DFIRE) + Z(ITScorePro)).
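The DI score defined above, Z(DFIRE) + Z(ITScorePro), followed by a top-N cut, can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the energies are made-up numbers, and the sketch assumes both raw scores are energy-like so that lower values are better. The default cut of 4,500 is the figure quoted in the footnote.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to zero mean and unit (population) std. dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def select_by_di(dfire, itscore, n_keep=4500):
    """Return (indices of the n_keep lowest-DI poses, all DI scores).

    Assumes both inputs are energy-like (lower is better); that sign
    convention is an assumption of this sketch.
    """
    di = [a + b for a, b in zip(z_scores(dfire), z_scores(itscore))]
    order = sorted(range(len(di)), key=di.__getitem__)
    return order[:n_keep], di


# Toy energies for five docked poses (made-up numbers, not from the paper).
dfire = [-10.0, -12.0, -8.0, -11.0, -9.0]
itscore = [-50.0, -55.0, -45.0, -65.0, -35.0]
kept, di = select_by_di(dfire, itscore, n_keep=2)
print(kept)
```

Summing Z-scores rather than raw energies puts the two scoring functions on a common scale, so neither dominates the ranking simply because its values span a wider range.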
Fig 2. Correlation between the docking score (DI) and the RMSD of the fragments.
Data for sequence windows 1 and 2 of 4i9o. Green: top 30 docked fragments by DI score.
Fig 3. Selection of correct IDP conformation with Path Score.
Hits: number of models with IDP RMSD < 6 Å in top 10 by Path Score. Blue: bound; red: unbound. Top: training complexes; bottom: testing complexes.
Fig 4. Complex between Bcl2-L-1 and BAD.
(A): A model of the bound structure (2bzw) before (purple) and after (orange) refinement vs. native (green). (B): Unbound (1pq0); blue-to-red (N-terminus on the left): native BAD; rainbow: top 7 models of BAD.
Fig 5. L-RMSD vs Model Score and IDP RMSD.
Inc: incorrect; Acc: acceptable; Med: medium. PDB ID: 2bzw.
Summary of modeling performance on training set.
| Bound PDB ID | L | Bound RFH | Bound RFH-B | Bound f | Bound BF10 | Unbound PDB ID | Unbound RFH | Unbound RFH-B | Unbound f | Unbound BF10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1ycrA | 15 | 1 (1) | 1 (1) | 0.42 | 1.00 | 1z1mA | 6 (320) | 6 (316) | 0.13 | 0.85 |
| 1fv1AB | 20 | 6 | 6 | 0.31 | 0.85 | 4ah2AB | 1 | 1 | 0.40 | 0.90 |
| 1wkwA | 20 | 16 | 15 | 0.39 | 0.45 | 1ipbA | 53 | 53 | 0.24 | 0.60 |
| 2cpkE | 20 | 4 (4) | 3 (3) | 0.56 | 1.00 | 1j3hA | 15 | 15 | 0.17 | 0.35 |
| 1sb0A | 25 | 3 | 3 | 0.32 | 1.00 | 4i9oA | 136 | 134 | 0.18 | 0.40 |
| 1sqkA | 25 | 14 | 14 | 0.36 | 0.24 | 1ijjA | 9 (63) | 9 (63) | 0.55 | 0.92 |
| 2bzwA | 27 | 1 (1) | 1 (1) | 0.49 | 1.00 | 1pq0A | - | - | - | 0.22 |
| 3owtAB | 27 | 6 | 5 | 0.33 | 0.90 | 3cz6AB | 52 | 50 | 0.13 | 0.35 |
| 1devA | 41 | 2 | 2 | 0.60 | 0.80 | 1khxA | 16 | 16 | 0.22 | 0.59 |
| 1p4qB | 44 | 5 | 5 | 0.27 | 0.82 | 1l3eB | 3 | 3 | 0.25 | 0.86 |
| 1jpwA | 45 | 1 (17) | 1 (17) | 0.38 | 0.92 | 2z6hA | 2 | 2 | 0.23 | 0.83 |
| 1l8cA | 51 | 33 (121) | 32 (118) | 0.26 | 0.57 | 1u2nA | 16 | 16 | 0.32 | 0.71 |
| 2c1tA | 51 | - | - | - | 0.06 | 1bk5A | - | - | - | 0.06 |
| 1xtgA | 59 | 5 | 3 | 0.17 | 0.61 | 1xtfA | - | - | - | 0.24 |
RFH: rank of first acceptable (medium) hit; RFH-B: rank of first acceptable (medium) hit pre-filtered with BindML (S4 Fig); f: fraction of native contacts for the first acceptable hit. BF10: in top 10, highest fraction of ligand C atoms with L-RMSD ≤ 10 Å. Acceptable and medium defined in S1 Table.
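The RFH notation used in the table, for example "6 (320)", can be reproduced from a ranked list of model quality labels. The sketch below is an assumption-laden illustration: the function names are hypothetical, the quality labels are taken as given (the actual thresholds are in S1 Table and are not reproduced here), and a medium model is counted as at least acceptable, which is consistent with entries such as "1 (1)".

```python
# Quality classes ordered from worst to best; thresholds live in S1 Table.
QUALITY_ORDER = {"incorrect": 0, "acceptable": 1, "medium": 2}


def rank_of_first_hit(ranked_qualities, at_least="acceptable"):
    """1-based rank of the first model at or above a quality, else None."""
    threshold = QUALITY_ORDER[at_least]
    for rank, quality in enumerate(ranked_qualities, start=1):
        if QUALITY_ORDER[quality] >= threshold:
            return rank
    return None


def format_rfh(ranked_qualities):
    """Render RFH as in the table: 'acceptable (medium)', or '-' for no hit."""
    acc = rank_of_first_hit(ranked_qualities, "acceptable")
    med = rank_of_first_hit(ranked_qualities, "medium")
    if acc is None:
        return "-"
    return f"{acc} ({med})" if med is not None else str(acc)


# Toy ranking: models sorted by score, labelled by quality (made-up labels).
print(format_rfh(["incorrect", "incorrect", "acceptable", "medium"]))
```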
Summary of performance on test set.
| Bound PDB ID | L | Bound RFH | Bound f | Bound BF10 | Unbound PDB ID | Unbound RFH | Unbound f | Unbound BF10 |
|---|---|---|---|---|---|---|---|---|
| 2w84A | 20 | 3 (35) | 0.54 | 0.90 | 5aonA | 6 (40) | 0.19 | 0.30 |
| 1axcA | 22 | 104 | 0.18 | 0.28 | 1vymA | 81 | 0.17 | 0.39 |
| 2pheAB | 26 | 11 | 0.25 | 0.23 | 1pcfAB | 15 | 0.29 | 0.23 |
| 1g0vA | 31 | 1 (1) | 0.65 | 1.00 | 1fmxA | 1 (4) | 0.29 | 1.00 |
| 3wn7A | 35 | 111 | 0.15 | 0.00 | 1x2jA | 343 | 0.28 | 0.13 |
| 1s70A | 39 | 252 | 0.31 | 0.33 | 4ut2A | - | - | 0.08 |
| 2o8gA | 40 | 17 | 0.32 | 0.60 | 1jk7A | 37 | 0.19 | 0.35 |
| 1l2wAB | 69 | 321 | 0.25 | 0.70 | 1jyaAB | 2 | 0.21 | 0.74 |
RFH: rank of first acceptable (medium) hit; f: fraction of native contacts for the first acceptable hit. BF10: in top 10, highest fraction of ligand C atoms with L-RMSD ≤ 10 Å. Acceptable and medium defined in S1 Table.
Fig 6. Examples of successful bound and unbound cases.
Green: native IDP; orange: modeled IDP. a-d: bound cases; e-h: unbound cases. a: Rank 1 model of MDM2 with bound P53 (PDB ID: 1ycr). f 0.42, I-RMSD 1.48 Å, L-RMSD 3.60 Å (medium quality). b: Rank 4 model of PKA C-α with bound protein kinase inhibitor α (2cpk). f 0.56, I-RMSD 1.95 Å, L-RMSD 4.41 Å (medium quality). c: Rank 6 model of RAP1 with bound SIR3 (3owt). f 0.33, I-RMSD 3.30 Å, L-RMSD 6.02 Å. d: Rank 5 model of BoNT/A with bound SNAP-25 (1xtg). f 0.17, I-RMSD 3.79 Å, L-RMSD 9.22 Å. e: Rank 1 model of DRA/DRB5 with unbound myelin basic protein (4ah2). f 0.39, I-RMSD 2.46 Å, L-RMSD 5.83 Å. f: Rank 9 model of α-actin-1 with unbound Cibulot (1ijj). f 0.55, I-RMSD 2.51 Å, L-RMSD 5.15 Å. g: Rank 3 model of Cbp/p300 with unbound CITED2 (1l3e). f 0.25, I-RMSD 6.31 Å, L-RMSD 7.43 Å. h: Rank 2 model of SycE with unbound YopE (1jya). f 0.21, I-RMSD 5.44 Å, L-RMSD 9.97 Å.
Performance comparison of IDP-LZerD to CABS-dock and pepATTRACT on ≤ 30 amino acid IDPs.
| Bound PDB ID | L | Bound top 10 hits: CABS-dock | Bound top 10 hits: pepATTRACT | Bound top 10 hits: IDP-LZerD | Unbound PDB ID | Unbound top 10 hits: CABS-dock | Unbound top 10 hits: pepATTRACT | Unbound top 10 hits: IDP-LZerD |
|---|---|---|---|---|---|---|---|---|
| 1ycrA | 15 | 4 | 7/4** | 5/4** | 1z1mA | 4 | - | 2 |
| 1fv1AB | 20 | 2 | 1 | 1 | 4ah2AB | 1 | 2 | 7 |
| 1wkwA | 20 | - | - | - | 1ipbA | - | - | - |
| 2cpkE | 20 | 2 | - | 1/1** | 1j3hA | - | - | - |
| 2w84A | 20 | 3/1** | - | 2 | 5aonA | 6/1** | - | 1 |
| 1axcA | 22 | - | 1 | - | 1vymA | - | - | - |
| 1sb0A | 25 | 1 | - | 3 | 4i9oA | - | - | - |
| 1sqkA | 25 | - | - | - | 1ijjA | - | - | 1 |
| 2pheAB | 26 | 1 | - | - | 1pcfAB | 2 | - | - |
| 2bzwA | 27 | - | n/a | 7/5** | 1pq0A | - | n/a | - |
| 3owtAB | 27 | - | - | 1 | 3cz6AB | - | - | - |
| Total hits | | 6/1** | 3/1** | 7/3** | | 4/1** | 1 | 4 |
The table only includes complexes with IDPs of up to 30 amino acids because the CABS-dock web server has a maximum length of 30 residues. n/a indicates that pepATTRACT did not run due to missing receptor residues. A dash (-) indicates no hits in the top 10.
** indicates medium-quality hits. For example, 5/4** indicates that, of the top 10 models, 5 were acceptable, of which 4 were of medium quality. The CABS-dock web server outputs 10 models and pepATTRACT outputs 50 models (results are shown for the first 10).
Performance of IDP-LZerD on ≥ 11 amino acid protein-peptide complexes from MD test sets.
| Unbound PDB ID | L | AnchorDock rank | AnchorDock RMSD (Å) | AnchorDock f | Dagliyan RMSD (Å) | IDP-LZerD rank | IDP-LZerD RMSD (Å) | IDP-LZerD f |
|---|---|---|---|---|---|---|---|---|
| 2am9 | 15 | 14 | 2.2 | 0.81 | n/a | - | - | - |
| 1jbe | 15 | 3 | 1.5 | 0.82 | 10.5 | 1 (83) | 8.9 (6.2) | 0.23 (0.64) |
| 2j2i | 14 | - | - | - | n/a | 9 (42) | 8.2 (4.7) | 0.13 (0.31) |
| 1oot | 12 | 3 | 1.7 | 0.77 | n/a | 1 (295) | 7.5 (3.6) | 0.30 (0.70) |
| 2aa2 | 12 | 1 | 2.0 | 0.81 | n/a | 306 | 5.0 | 0.28 |
| 1i7g | 12 | 4 | 2.2 | 0.73 | n/a | 1 (11) | 6.0 (4.39) | 0.28 (0.39) |
| 1b9k | 12 | - | - | - | n/a | - | - | - |
| 1rwz | 11 | 6 | 1.3 | 0.74 | 5.77 | 3 | 6.9 | 0.26 |
a: Values from Table 2 in [19];
b: values from Table 1 in [18].
For IDP-LZerD, results are shown for the first acceptable (medium) hit. Dash (-) indicates no hits; n/a indicates that the complex was not part of the dataset. All RMSD values are for ligand backbone atoms.
Fig 7. Biological case studies.
A: β-catenin in complex with TCF7L2. Green: native TCF7L2; orange: rank 1 model of TCF7L2; f 0.38, I-RMSD 2.85 Å, L-RMSD 7.94 Å. PDB ID: 1jpw. B-E: Human and mouse Cbp/p300 TAZ1 domain in complex with CITED2 and Hif-1α. Green/cyan: native CITED2/Hif-1α; orange/yellow: model CITED2/Hif-1α. Ball and stick: LPXL motif. B-C: Human TAZ1 and CITED2. B: bound (1p4qB); rank 5 model; f 0.27, I-RMSD: 4.2 Å, L-RMSD: 7.6 Å. C: unbound (1l3eB); rank 9 model; f 0.17, I-RMSD: 7.1 Å, L-RMSD: 9.6 Å. D-E: Mouse TAZ1 and Hif-1α. D: bound (1l8cA); rank 16 model; f 0.05, I-RMSD 11.7 Å, L-RMSD 20.1 Å. E: unbound (1u2nA); rank 9 model; f 0.20, I-RMSD 6.4 Å, L-RMSD 10.4 Å. F: Unbound complex between BoNT/A and sn2. Green: native sn2; orange: rank 1 model of sn2; f 0.00, I-RMSD 15.7 Å, L-RMSD 38.2 Å. Receptor PDB ID 1xtfA (unbound).
Fig 8. Fragment geometry subject to cutoffs.
Midpoint distance: between the C atoms of the middle residues of two fragments; overlap distance: between the C atoms of the residues before and after the overlapping residues; overlap pair distance: between the corresponding N, C, C, and C atoms of the three overlapping residues; overlap angle: formed by the vectors from the N atom of the first overlapping residue to the C atom of the third overlapping residue.
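Two of the quantities in this caption, the midpoint distance and the overlap angle, reduce to a point-to-point distance and an angle between two vectors. The sketch below illustrates only that arithmetic. The function names and cutoff values are hypothetical, the coordinates are toy numbers, treating the cutoffs as upper bounds is an assumption, and the choice of which atoms supply the coordinates is left to the caller.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def overlap_vector(first_atom, third_atom):
    """Vector from an atom of the first overlapping residue to one of the third."""
    return tuple(b - a for a, b in zip(first_atom, third_atom))


def compatible(mid_a, mid_b, vec_a, vec_b, max_mid_dist, max_angle):
    """True if two fragments pass both (hypothetical, upper-bound) cutoffs."""
    return (math.dist(mid_a, mid_b) <= max_mid_dist
            and angle_deg(vec_a, vec_b) <= max_angle)


# Toy coordinates in Å; the cutoffs below are placeholders, not paper values.
vec_a = overlap_vector((0.0, 0.0, 0.0), (6.0, 0.0, 0.0))
vec_b = overlap_vector((0.5, 0.0, 0.0), (6.0, 1.0, 0.0))
print(compatible((3.0, 0.0, 0.0), (6.0, 4.0, 0.0), vec_a, vec_b,
                 max_mid_dist=8.0, max_angle=30.0))
```

Cheap geometric checks of this kind are typically applied before any more expensive comparison, so that pairs of fragments that cannot plausibly be joined are discarded early.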