Settu Sridhar1, Kunchur Guruprasad1.
Abstract
We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs with sequence lengths of 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
Year: 2014 PMID: 25210740 PMCID: PMC4161436 DOI: 10.1371/journal.pone.0107647
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
INVPEPs in 3967 representative protein chains from the Protein Data Bank (PDB).
| INVPEP sequence length (number of amino acid residues) | Number of non-redundant INVPEPs observed in representative PDB dataset | Number of INVPEPs of known secondary structure conformation | Number of INVPEPs for which solvent accessibility was calculated | Number of INVPEPs with difference in solvent accessibility values ≤1 Å² | Number of unique INVPEPs in non-redundant PDB dataset | Number of unique PDB_IDs representing the INVPEPs |
| Five | 148313 | 125397 | 124921 | 3520 | 93493 | 3760 |
| Six | 9005 | 5331 | 5297 | 168 | 8478 | 3495 |
| Seven | 10460 | 460 | 454 | 15 | 577 | 1378 |
| Eight | 995 | 40 | 40 | 1 | 91 | 493 |
| Nine | 1279 | 2 | 2 | 1 | 24 | 245 |
| Ten | 607 | 1 | 1 | 0 | 9 | 171 |
| Eleven | 107 | 0 | 0 | 0 | 3 | 109 |
| Twelve | 1 | 1 | 1 | 0 | 1 | 1 |
| Eighteen | 1 | 0 | 0 | 0 | 1 | 2 |
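The counts in the table above rest on enumerating peptide pairs in which one sequence is the exact reverse of the other. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `find_inverted_pairs`, the toy chain identifiers and the toy sequences are all hypothetical, and the choice to skip palindromic peptides is an assumption made here for illustration.

```python
from collections import defaultdict


def find_inverted_pairs(chains, k):
    """List k-residue peptide pairs where one is the reverse of the other.

    chains: dict mapping a chain identifier to its one-letter sequence.
    Returns tuples (peptide, (chain, start), inverted_peptide, (chain, start))
    with 1-based start positions.
    """
    # Index every k-residue window by its sequence.
    index = defaultdict(list)
    for chain_id, seq in chains.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((chain_id, i + 1))

    pairs = []
    for pep, locs in index.items():
        rev = pep[::-1]
        # pep < rev reports each pair once and skips palindromes
        # (an assumption of this sketch, not a rule taken from the paper).
        if pep < rev and rev in index:
            for loc_a in locs:
                for loc_b in index[rev]:
                    pairs.append((pep, loc_a, rev, loc_b))
    return pairs


# Hypothetical toy sequences embedding STEEI and its inverse IEETS.
toy_chains = {"A": "MSTEEIK", "B": "GIEETSA"}
pairs = find_inverted_pairs(toy_chains, 5)
```

Indexing all windows first keeps the lookup for each reversed peptide to a single dictionary access, which matters when the same search is repeated for every length from 5 upward.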
Examples of protein pairs containing inverted peptides with ‘similar structural environment’.
| S.No. | PDB_ID (protein 1) | Location in protein 1 (peptide 1) | Sequence (peptide 1) | Solvent accessibility (peptide 1) | Number of neighbourhood residues (peptide 1) | Secondary structure (peptide 1) | PDB_ID (protein 2) | Location in protein 2 (peptide 2) | Sequence (peptide 2) | Solvent accessibility (peptide 2) | Number of neighbourhood residues (peptide 2) | Secondary structure (peptide 2) |
| 1 | 3BGY:A | 146–150 | STEEI | 31.76 | 8 | EEEEE | 3FWK:A | 121–125 | IEETS | 31.58 | 10 | HHHHH |
| 2 | 2PKH:H | 125–129 | ERALA | 72.3 | 5 | HHHHH | 3E1I:A | 168–172 | ALARE | 71.6 | 4 | CCCCC |
| 3 | 2OC5:A | 180–185 | LEANRE | 64.4 | 8 | HHHHHH | 3HE4:B | 15–20 | ERNAEL | 63.82 | 10 | HHHHHH |
| 4 | 1OUW:D | 42–46 | PIALT | 6.72 | 10 | EEEEE | 1OUW:D | 761–765 | TLAIP | 5.76 | 10 | EEEEE |
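As a quick consistency check on the four examples above, the sketch below confirms that each peptide 2 is the reverse of peptide 1 and that the two solvent accessibility values differ by no more than 1 Å², the threshold named in the first table's header. The function name `is_candidate` is hypothetical, and the check uses only sequence and solvent accessibility; any further criteria the authors apply to define 'similar structural environment' (for example the number of neighbourhood residues) are not modeled here.

```python
# Rows copied from the table above:
# (protein 1, peptide 1, accessibility 1, protein 2, peptide 2, accessibility 2)
examples = [
    ("3BGY:A", "STEEI", 31.76, "3FWK:A", "IEETS", 31.58),
    ("2PKH:H", "ERALA", 72.3, "3E1I:A", "ALARE", 71.6),
    ("2OC5:A", "LEANRE", 64.4, "3HE4:B", "ERNAEL", 63.82),
    ("1OUW:D", "PIALT", 6.72, "1OUW:D", "TLAIP", 5.76),
]


def is_candidate(seq1, acc1, seq2, acc2, max_diff=1.0):
    """True if seq2 is seq1 reversed and accessibilities differ by <= max_diff."""
    return seq1 == seq2[::-1] and abs(acc1 - acc2) <= max_diff


results = [is_candidate(s1, a1, s2, a2) for _, s1, a1, _, s2, a2 in examples]
```

All four rows pass this check, with accessibility differences of about 0.18, 0.70, 0.58 and 0.96.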
Figure 1. Schematic representation showing structural overlay of the ‘artificial’ proteins (cyan) modeled on the PDB templates (green); (a) 3BGY_A, (b) 2PKH_H, (c) 2OC5_A and (d) 1OUW_D.
The secondary structure conformations corresponding to the inverted peptide, its corresponding peptide in the native protein, and the peptide amino acid residue side-chains are shown in the panels alongside.
Total number of INVPEPs observed in helix (H), strand (E) and coil (C) conformation among 1625 ‘swappable’ INVPEPs.
| | H | E | C |
| H | 223 | 36 | 3 |
| E | 41 | 40 | 0 |
| C | 1 | 0 | 0 |