Bin Xue, A Keith Dunker, Vladimir N Uversky.
Abstract
Many cell functions in all living organisms rely on protein-based molecular recognition involving disorder-to-order transitions upon binding by molecular recognition features (MoRFs). A well-accepted computational tool for identifying likely protein-protein interactions is sequence alignment. In this paper, we propose the combination of sequence alignment and disorder prediction as a tool to improve the confidence of identifying MoRF-based protein-protein interactions. The method of reverse sequence alignment is also rationalized here as a novel approach for finding additional interaction regions, leading to the concept of a retro-MoRF, which has the reversed sequence of an identified MoRF. The set of retro-MoRF binding partners likely overlaps the partner sets of the originally identified MoRFs. The high abundance of MoRF-containing intrinsically disordered proteins in nature suggests that the number of retro-MoRFs could likewise be very high. This hypothesis provides new grounds for exploring the mysteries of protein-protein interaction networks at the genome level.
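The retro-MoRF idea can be illustrated with a minimal sketch: take the residues of an identified MoRF and read them in reverse order, giving a query for reverse sequence alignment. The function name `retro_sequence`, the 0-based end-exclusive coordinates, and the toy sequence are assumptions made for illustration only; they are not taken from the paper.

```python
def retro_sequence(sequence: str, start: int, end: int) -> str:
    """Return the segment sequence[start:end] read in reverse order.

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    if not 0 <= start < end <= len(sequence):
        raise ValueError("segment bounds fall outside the sequence")
    return sequence[start:end][::-1]


# Hypothetical 12-residue sequence, not a real protein from the study.
toy_protein = "MKTAYIAKQRQI"
toy_morf = toy_protein[3:9]                     # the assumed MoRF segment
toy_retro = retro_sequence(toy_protein, 3, 9)   # the same residues, reversed
print(toy_morf, toy_retro)
```

The reversed segment can then be submitted to the same alignment search as the original MoRF.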
Keywords: PONDR-RIBS; alignment; intrinsic disorder; invert; retro; reverse
Year: 2010 PMID: 21152297 PMCID: PMC2996789 DOI: 10.3390/ijms11103725
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Disorder and MoRF prediction for (a) RNase E, (b) p53, and (c) SRC-3. The thin solid lines are predictions from PONDR® VL-XT, the dotted lines are predictions from PONDR-FIT, and the horizontal bold lines are the MoRF regions identified by the MoRF-II predictor. In panel (a), N and C1–C4 denote one N-terminal dip and four C-terminal dips of RNase E. In panel (b), N, C1, and C2 denote the N-terminal dip and two C-terminal dips of p53. In panel (c), there is one N-terminal dip (N), two middle-region dips (M1 and M2), and two C-terminal dips (C1 and C2).
Table 1. MoRFs of three proteins and their alignment matches in PDB and SwissProt.
| Protein | MoRF | Proteins in PDB containing similar MoRF | SwissProt id | SwissProt species | SwissProt protein name | Within IDR |
|---|---|---|---|---|---|---|
| RNase E | N | --- | Q9R5Y8 | | Cell shape determining protein | Yes |
| | | | A5UA75 | Haemophilus influenzae | Hydroxyethylthiazole kinase | Yes |
| | | | A4NVQ3 | Haemophilus influenzae | rRNA pseudouridylate synthase C | Yes |
| | | | Q65S31 | Mannheimia succiniciproducens | CafA protein | Yes |
| | C1 | --- | --- | --- | --- | |
| | C2 | --- | B0U5Z2 | Xylella fastidiosa | Glutamyl-tRNA reductase | Yes |
| | | | Q65I31 | Bacillus licheniformis | Anthranilate synthase TrpE | Yes |
| | C3 | --- | A5UA75 | Haemophilus influenzae | Hydroxyethylthiazole kinase | Yes |
| | | | A4NVQ3 | Haemophilus influenzae | rRNA pseudouridylate synthase C | No |
| | | | Q65S31 | Mannheimia succiniciproducens | CafA protein | No |
| | C4 | --- | --- | --- | --- | |
| p53 | N | --- | A0M1H7 | Gramella forsetii | Carbohydrate kinase | Yes |
| | | --- | B9RU24 | Ricinus communis | Mitochondrial respiratory chain complexes assembly protein, putative | |
| | C1 | --- | C6Y295 | Pedobacter heparinus | DNA polymerase III, α subunit | No/Yes |
| | C2 | --- | --- | --- | --- | |
| SRC-3 | N | --- | --- | --- | --- | |
| | M1 | --- | Q6NSP2 | Zebrafish | Rho/rac guanine nucleotide exchange factor (GEF) | Yes |
| | | | B7JAA3 | Acidithiobacillus ferrooxidans | Nif-specific regulatory protein | Yes |
| | | | B5ER80 | Acidithiobacillus ferrooxidans | Transcriptional regulator, NifA, Fis Family | Yes |
| | | | A6VPZ1 | Actinobacillus succinogenes | Sulfite reductase [NADPH] hemoprotein beta-component | No |
| | | | Q01FQ6 | Ostreococcus tauri | CLP protease regulatory subunit CLPX (ISS) | Yes |
| | | | A4RRW1 | Ostreococcus lucimarinus | Mitochondrial ClpX chaperone | Yes |
| | M2 | --- | Q6N6F5 | Rhodopseudomonas palustris | ATP-dependent DNA helicase | Yes |
| | | | C1AT82 | Rhodococcus opacus | Hypothetical membrane protein | No |
| | | | B3QIU1 | Rhodopseudomonas palustris | DEAD/DEAH box helicase domain protein | Yes |
| | C1 | --- | --- | --- | --- | |
| | C2 | --- | --- | --- | --- | |
Only proteins different from the original protein and its family are listed.
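As a rough illustration of how a "Within IDR" entry could be derived, the sketch below computes the fraction of residues in a matched segment whose per-residue disorder score reaches a threshold. The 0.5 threshold, the function names, and the mapping of a mixed result to "No/Yes" are assumptions of this sketch; the source does not state how the column was assigned.

```python
from typing import Sequence


def disordered_fraction(scores: Sequence[float], start: int, end: int,
                        threshold: float = 0.5) -> float:
    """Fraction of residues in scores[start:end] with score >= threshold.

    Coordinates are 0-based and end-exclusive; the 0.5 threshold is an
    assumed convention, not a value quoted from the source.
    """
    if not 0 <= start < end <= len(scores):
        raise ValueError("segment bounds fall outside the score profile")
    segment = scores[start:end]
    return sum(score >= threshold for score in segment) / len(segment)


def idr_label(fraction: float) -> str:
    """Map a disordered fraction to a table-style label (assumed mapping)."""
    if fraction == 1.0:
        return "Yes"
    if fraction == 0.0:
        return "No"
    return "No/Yes"


# Hypothetical disorder scores for a six-residue stretch.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6]
print(idr_label(disordered_fraction(toy_scores, 0, 3)))
```

A stricter or looser rule (for example, a majority vote) would be equally easy to substitute in `idr_label`.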
Figure 2. Disorder prediction and sequence alignment for the proteins shown in Table 1, which are alignment matches of all the MoRF regions of the three proteins in our study. Alignment hits were cut off at an E-value of 0.001. Disorder was predicted with PONDR® VL-XT. The partial sequence alignment is shown in the inset. Insertions in the original alignment were deleted to match the disorder prediction curve, and the disorder score curves were shifted so that the aligned segments overlap. The N, C2, and C3 MoRF regions of RNase E are shown in (a), (b), and (c), respectively; (d) and (e) show the N and C1 MoRF regions of p53; (f) and (g) show the M1 and M2 MoRF regions of SRC-3.
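The two preprocessing steps named in the caption, the E-value cutoff and the deletion of insertions, might look like the following sketch. The `Hit` record, the inclusive comparison at the cutoff, and the reading of "insertions" as alignment columns where the query MoRF has a gap are all assumptions made for illustration; the hit identifiers and sequences are hypothetical.

```python
from typing import List, NamedTuple, Tuple

E_VALUE_CUTOFF = 0.001  # cutoff stated in the caption


class Hit(NamedTuple):
    hit_id: str      # hypothetical identifier
    e_value: float
    query_aln: str   # aligned query (MoRF) string, '-' marks a gap
    hit_aln: str     # aligned hit string of the same length


def filter_hits(hits: List[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value does not exceed the cutoff (inclusive by assumption)."""
    return [hit for hit in hits if hit.e_value <= cutoff]


def drop_insertions(query_aln: str, hit_aln: str) -> Tuple[str, str]:
    """Remove alignment columns where the query has a gap.

    After this step every remaining column maps to one query residue, so
    per-residue disorder scores can be laid over the aligned segment.
    """
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned strings must have equal length")
    kept = [(q, h) for q, h in zip(query_aln, hit_aln) if q != "-"]
    return "".join(q for q, _ in kept), "".join(h for _, h in kept)


toy_hits = [
    Hit("toy_hit_1", 1e-5, "AC-DE", "ACXDE"),
    Hit("toy_hit_2", 0.5, "ACDE", "ACDE"),
]
for hit in filter_hits(toy_hits):
    print(hit.hit_id, drop_insertions(hit.query_aln, hit.hit_aln))
```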
Table 2. Reversed segments of MoRFs of three proteins and their alignment matches in PDB and SwissProt.
| Protein | MoRF | Proteins in PDB containing similar MoRF | SwissProt id | SwissProt species | SwissProt protein name | Within IDR |
|---|---|---|---|---|---|---|
| RNase E | rN | --- | B8EIZ0 | Methylocella silvestris | Glycosyl transferase family 2 | Yes |
| | rC1 | --- | A0RVW2 | Cenarchaeum symbiosum | Putative uncharacterized protein | Yes |
| | rC2 | --- | A8IXM6 | Chlamydomonas reinhardtii | Dopamine beta-monooxygenase-like protein | Yes |
| | | | C0VZ26 | Actinomyces coleocanis | 30S ribosomal protein S5 | Yes |
| | | | Q2C7B7 | Photobacterium | Pseudouridine synthase | Yes |
| | | | Q7KA80 | Drosophila melanogaster | Heterogeneous nuclear ribonucleoprotein | Yes |
| | | | A1ZBW0 | Drosophila melanogaster | Bancal isoform C | Yes |
| | rC3 | --- | --- | --- | --- | |
| | rC4 | --- | B9LNU7 | Halorubrum lacusprofundi | Manganese containing catalase | Yes |
| p53 | rN | --- | --- | --- | --- | |
| | rC1 | --- | B8F7I3 | Haemophilus parasuis serovar 5 | tRNA modification GTPase TrmE | Yes |
| | | | Q0QE22 | Haemophilus parasuis | ThdF | Yes |
| | | | A8FF31 | Bacillus pumilus | 3-dehydroquinate dehydratase | No |
| | rC2 | --- | --- | --- | --- | |
| SRC-3 | rN | --- | B6R7C1 | Pseudovibrio | Outer surface protein | No |
| | | | Q0SCR7 | Rhodococcus | Aldehyde dehydrogenase | Yes |
| | | | C1B3Y6 | Rhodococcus opacus | Phenylacetic acid degradation protein PaaN | Yes |
| | rM1 | --- | C5PX68 | Sphingobacterium spiritivorum | Conserved hypothetical transmembrane protein | No |
| | rM2 | --- | Q880S9 | Pseudomonas syringae | AraC-family transcriptional regulator | No |
| | | | Q1DCP2 | Myxococcus xanthus | Tetratricopeptide repeat protein | No/Yes |
| | | | Q8NLQ1 | Corynebacterium glutamicum | UDP-galactopyranose mutase | No |
| | | | A4CL95 | Robiginitalea biformata | Type III restriction enzyme | No |
| | rC1 | --- | --- | --- | --- | |
| | rC2 | --- | Q5AHB1 | Candida albicans | Actin cytoskeleton-regulatory complex protein PAN1 | Yes |
“r” stands for reversed segment.
Only proteins other than the original protein and its family are included.
Figure 3. Disorder prediction and sequence alignment for the proteins in Table 2, which are reverse alignment matches of all the MoRF regions of the three proteins. Alignment hits were again cut off at an E-value of 0.001. Disorder scores were predicted with PONDR® VL-XT. In the sequence alignment insets, the original MoRF regions are shown in reversed order, while the other alignment hits are shown in normal sequential order. The disorder score curves were again shifted so that the aligned segments overlap. The N, C1, C2, and C4 MoRF regions of RNase E are shown in (a)–(d), respectively. The C1 MoRF region of p53 is shown in (e). The N, M1, M2, and C2 MoRF regions of SRC-3 are shown in (f)–(i), respectively.